The Guide to Practical and Pragmatic IT Architecture Design

Input to IT Architecture: Technical Requirements

The most important requirements for technical design are the technical requirements. They describe the technical constraints and limits under which the solution needs to operate. In building design, the technical requirements are the operational limits the construction has to comply with, such as the number of people, water, gas and electricity usage, number of floors, etc. The same applies to large IT systems.

When talking about technical requirements, typically we focus on the 7 main -ilities that categorize these considerations:
1. Scalability & Performance 
2. Availability 
3. Maintainability & Flexibility
4. Security
5. Operability
6. Interoperability & Portability
7. Usability

Each of these considerations needs to be analyzed, and the requirements need to be prioritized as high, medium, low or nice-to-have. In this process, the priorities are typically distributed as 15% high, 50% medium, 25% low and 10% nice-to-have. These requirements form the input for the solution fundamentals as well as for the cost and effort to build the application.

[Figure: Technical Requirements Software]

Each of the requirements is detailed below:

Performance & Scalability Requirements

Performance and scalability are among the most critical technical requirements and need to be analyzed well before starting any design.

Performance 

If a building design is wrongly estimated for the number of occupants, it may overpopulate hallways, slow down traffic flows inside the building and finally collapse under its load. Not something an architect wants to be responsible for. And for IT, it is exactly the same. 

To define performance requirements, one needs to define the response times of a system for a certain number of medium and complex transactions. This could be, for instance, a transaction to log in, query a catalog, obtain a quote or confirm the purchase of a ticket.

End-to-end Response time for specific transactions 

One needs to define a set of specific transactions that can be measured from start to end. A transaction typically starts with a user hitting enter or clicking on a button and ends with a screen result. But it could also be an interface, a batch process or any other transaction that could be triggered by a user, a specific event, process or circumstances, e.g. a stock price hitting a certain threshold price. 

As the transactions of complex systems depend on many different data sets, logic and network response times, the response time of each transaction will vary, and predicting a system response time with 100% certainty is impossible. Therefore, one needs to define the response time as an 80% response time of a transaction, meaning that 80% of the runs of the same transaction complete within the stated time. A typical end-to-end transaction response time is 2-3 seconds, but it depends on the specific transaction and industry. An example of defining performance response times is shown below:

 Module    Transaction               Expected Response Time
 Payment   Login                     <1.5 seconds for 80% of transactions
           Validate Client Details   <1.5 seconds for 80% of transactions
           Validate Payment          <5 seconds for 80% of transactions
           etc.
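
As a minimal sketch of how such a target could be checked against measurements, the Python snippet below computes the 80th percentile of a set of response times with the nearest-rank method and compares it to the target. The sample timings and the function names are hypothetical and only serve as illustration.

```python
import math


def percentile(samples, pct):
    """Return the nearest-rank percentile of a list of response times (seconds)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # nearest-rank method
    return ordered[max(rank, 1) - 1]


def meets_target(samples, target_seconds, pct=80):
    """True if at least pct% of the measured runs finished under the target."""
    return percentile(samples, pct) < target_seconds


# Hypothetical measurements for the "Login" transaction, in seconds.
login_times = [0.8, 0.9, 1.0, 1.1, 1.2, 1.2, 1.3, 1.4, 2.6, 3.1]

print("80th percentile:", percentile(login_times, 80))
print("meets <1.5 s target:", meets_target(login_times, 1.5))
```

Note that two slow outliers do not break the requirement here, which is exactly why the target is expressed for 80% of transactions rather than for all of them.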

The set typically covers 15-25 key transactions but could include more depending on the system complexity. In practice, however, a list rarely exceeds 60 identified transactions.
Some more complex transactions require user interaction. In those cases, a think time needs to be defined: how long a user needs on average to think and respond to the application. This makes measuring the application response time more difficult, and it is not required for an architecture design.

Volume Conditions

A system will be fast if no other processes are running or the database is still nearly empty. But once the application is in use, more users will use the system, transactions will be processed in parallel, other processes will run and the database will fill with more data. Therefore, when defining these response times, one needs to assume that they apply to the system running on a normal day after one or more years. We will go into more detail on this as part of performance testing the solution and its architecture.

Scalability

Scalability is another key aspect for defining a system architecture. While we already mentioned the importance of performance and response times, an architect also needs to plan for expansion or increased loads of data, number of transactions or usage, which is called scalability.

Scalability looks at how well a system can accommodate increased volumes in the future as well as how it can handle current volume peaks without slowing down or breaking. To understand the maximum throughput a system needs to accommodate, several key volume metrics need to be analyzed, such as the number of users, clients, transactions, queries, etc. The business can typically provide these numbers, or they can be extrapolated from the current systems.
 
Another dimension of analyzing volume metrics is the expected annual growth of these volumes. A business may expect to grow 8% YoY (year on year), and therefore the number of transactions, queries and data may be expected to grow at similar rates. However, in certain scenarios there may be non-organic growth, for example if the business expects to acquire other companies that need to use the same system, or if the company, once acquired by another company, wants to offer its software more widely and needs to host the additional clients as well.
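
The growth reasoning above can be sketched in a few lines of Python. The function below projects a volume metric with compound annual growth and an optional one-off jump for non-organic growth. The starting volume, the acquisition size and the parameter names are assumptions chosen for illustration, not figures from a real system.

```python
def project_volume(current, annual_growth, years, acquisitions=None):
    """Project a volume metric year by year.

    annual_growth is a fraction (0.08 for 8% YoY); acquisitions maps a year
    offset to a one-off volume added in that year (non-organic growth).
    """
    acquisitions = acquisitions or {}
    volumes = [current]
    for year in range(1, years + 1):
        grown = volumes[-1] * (1 + annual_growth)
        volumes.append(grown + acquisitions.get(year, 0))
    return [round(v) for v in volumes]


# Organic growth only: 80,000 users growing 8% per year for 3 years.
print(project_volume(80_000, 0.08, 3))

# Same growth, plus a hypothetical acquisition adding 20,000 users in year 2.
print(project_volume(80_000, 0.08, 3, acquisitions={2: 20_000}))
```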

A volume metrics table focuses on the current volumes as well as the metrics for the next 3 years. Most IT solutions are designed for a life of 3 years, but in some cases this may be extended to 5 or 7 years. As technology evolves over time, more than 5 years is rare, as most solutions will have a new release, technology and architecture within that time.

A volume metrics table looks at the 3-8 key volume metrics that size the system, such as the number of users and orders, and looks like the following:

                  2020                         2021                2022
                  Concurrent / Peak per hour   Concurrent / Peak   Concurrent / Peak
 #Users           80,000 / 120,000             88,000 / 132,000    95,000 / 145,000
 #Sales Orders    6,000 / 9,000                6,600 / 10,000      7,200 / 12,000
 Etc.

The table shows the expected key volumes for a certain business line. It needs to show concurrent as well as peak volumes. Concurrent is the number of transactions that run in parallel, while peak is the number of transactions that run at the peak of the day or month. For instance, a transaction may run 1,000 times in a day, but it has a different impact if it is distributed over 24 hours than if it runs in a batch window of 1 hour. Therefore, the peak shows how many transactions are expected for a specific hour, minute or even second if high performance is required.
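
To make the difference concrete, the short Python calculation below compares the average transaction rate for the same 1,000 transactions spread over 24 hours versus compressed into a 1-hour batch window. It is a simple arithmetic sketch; the function name is hypothetical.

```python
def average_rate_per_second(transactions, window_hours):
    """Average number of transactions per second within the given window."""
    return transactions / (window_hours * 3600)


spread = average_rate_per_second(1_000, 24)  # distributed over the whole day
batch = average_rate_per_second(1_000, 1)    # compressed into a 1-hour window

print(f"spread over 24 hours: {spread:.4f} transactions/second")
print(f"1-hour batch window:  {batch:.4f} transactions/second")
print(f"load factor: {batch / spread:.0f}x")
```

The daily total is identical in both cases, but the system has to be sized for the batch window rate, not for the daily average.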

Later in the design of the architecture we investigate how we can design system scalability based on this input.

 Technical Requirements   Input Considerations
 Performance
  • End-to-end response time for a defined set of transactions
  • Expected response time for 80% of transactions
  • Under which volume conditions
 Scalability
  • Volume metrics
  • Current and at least for the next 3 years
  • Day, month, annual and peak

Availability

Availability is another key architecture requirement and is part of software reliability. Availability is the time that a system is available to users and for processing. It is not limited to availability for users, but also covers batch processing at night. Several key metrics need to be defined in this area.

The first one that needs to be clearly defined is the availability window. For highly critical systems such as (international) websites and stock exchange platforms, this could be 7 days x 24 hours. For less critical systems, this could be 5 days x 8 hours per day, or something in between. One needs to consider that there must be room for planned downtime due to maintenance or offline backups, and that needs to be defined as well.

The other dimension is the availability percentage of the system, typically derived from the maximum number of hours a system can be offline in total for a given period. It is the number of hours the system is up divided by the total number of hours in the availability window, expressed as a percentage. Normal values are 98% for low-criticality systems, 99.5% for more critical systems, or even higher, such as 99.995%. Note that the higher the availability percentage, the more complex and expensive the system will be. A full 100% availability is unfeasible as a target, as there is always a certain time needed to switch to a backup or recovery system, but an availability percentage of 99.9999999% ("nine nines"), for instance, would give an allowed downtime of about 86 microseconds per day, which would be unnoticeable to any user.
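
The relation between availability percentage and allowed downtime is simple arithmetic, sketched below in Python. The yearly window sizes (8,760 hours for 7 x 24, and 2,080 hours for 5 x 8 assuming 52 working weeks) are assumptions for illustration.

```python
def allowed_downtime_seconds(availability_pct, window_hours):
    """Maximum downtime, in seconds, for an availability % over a window."""
    return (100 - availability_pct) / 100 * window_hours * 3600


HOURS_7X24_YEAR = 365 * 24    # 8,760 hours in a 7 x 24 window per year
HOURS_5X8_YEAR = 52 * 5 * 8   # 2,080 hours, assuming 52 working weeks

for pct in (98.0, 99.5, 99.995):
    hours = allowed_downtime_seconds(pct, HOURS_7X24_YEAR) / 3600
    print(f"{pct}% over 7 x 24: {hours:.2f} hours of downtime per year")

hours = allowed_downtime_seconds(98.0, HOURS_5X8_YEAR) / 3600
print(f"98.0% over 5 x 8: {hours:.2f} hours of downtime per year")

# "Nine nines" over a single day, expressed in microseconds.
micro = allowed_downtime_seconds(99.9999999, 24) * 1_000_000
print(f"99.9999999% per day: {micro:.1f} microseconds")
```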

One additional requirement is the maximum downtime for a single incident, as a definition like a maximum total downtime of 2 hours can be met by 12 incidents of 10 minutes, by 6 incidents of 20 minutes, or in the worst case by one incident of 2 hours of unavailability. During the design phase, systems are designed around a Mean Time To Repair (MTTR), while the measured uptime of the overall system between failures is referred to as Mean Time Between Failures (MTBF). However, as principal requirements one needs to look first at the % availability, the total allowed downtime and the maximum allowed downtime for a single incident.
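
The sketch below illustrates both points in Python. The first function checks a list of incidents against a total downtime budget and a per-incident limit; the 30-minute per-incident limit is a hypothetical value. The second function uses the commonly quoted steady-state relationship availability = MTBF / (MTBF + MTTR), which is not taken from this guide and is shown only as an assumption for relating the two metrics.

```python
def within_limits(incident_minutes, total_budget_minutes, max_single_minutes):
    """True if the incidents respect both the total and the per-incident limit."""
    total_ok = sum(incident_minutes) <= total_budget_minutes
    single_ok = max(incident_minutes, default=0) <= max_single_minutes
    return total_ok and single_ok


def availability_from_mtbf(mtbf_hours, mttr_hours):
    """Steady-state availability as a fraction, from MTBF and MTTR."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Same 2 hours of total downtime, very different impact per incident.
print(within_limits([10] * 12, 120, 30))  # twelve short incidents
print(within_limits([20] * 6, 120, 30))   # six medium incidents
print(within_limits([120], 120, 30))      # one long incident

# A failure every 999 hours on average, repaired in 1 hour on average.
print(f"{availability_from_mtbf(999, 1):.3%}")
```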

In the architecture design chapter, the impact of (high) availability in the IT architecture design is detailed in depth.

 Technical Requirements Input Considerations
 Availability
  • Availability Window, 
  • % Availability / Total allowed maximum downtime, 
  • Maximum allowed downtime per incident

Maintainability & Flexibility

Maintainability and flexibility require that a system can be maintained easily from an operational support perspective as well as be flexible from a development side.

From a development point of view, a system needs to allow functional or technical enhancements to be made or added without major redevelopment. The system architecture needs to provide a platform that allows specific changes to be made in an agile and timely manner, such as allowing that:
  • Business processes can be modified through an orchestration tool. For example, steps in a process may need to be modified or reordered, or roles may need to be updated.
  • Specific business or functional parameters or rules can be changed through an administration panel. These include, for instance, thresholds such as the level at which a loan is considered risky, or the rules and conditions that apply to approving a visa.
  • Functionalities, modules, components or specific services can be shared and/or re-used.
  • Minor bug fixes and small enhancements can be resolved and deployed rapidly. What needs to be considered here is the expected time from development through test to deployment of a new small or medium-sized feature, and whether the feature needs to be deployable without rebooting the server.
  • New developers can be added to the application team without any major training or coaching.
The main objective is to avoid hard-coded or permanent structures in an application that need to be redeveloped for small changes.
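
As a small illustration of avoiding hard-coded structures, the Python sketch below keeps a business threshold in external configuration instead of in the code. The configuration keys, the threshold values and the function names are hypothetical; in a real system the JSON would come from a file or an administration panel rather than a string constant.

```python
import json

# Hypothetical configuration an administrator could edit without a code change.
CONFIG_JSON = '{"loan_risk_threshold": 0.35, "max_loan_amount": 50000}'


def load_rules(raw):
    """Parse business rules from their external (JSON) representation."""
    return json.loads(raw)


def is_risky_loan(debt_to_income, rules):
    """Apply the configurable threshold instead of a hard-coded constant."""
    return debt_to_income > rules["loan_risk_threshold"]


rules = load_rules(CONFIG_JSON)
print(is_risky_loan(0.40, rules))    # evaluated against the configured 0.35

rules["loan_risk_threshold"] = 0.45  # simulated change through an admin panel
print(is_risky_loan(0.40, rules))    # same code, new business rule
```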

Maintainability is the other aspect of these requirements and addresses the maintenance of a system while it runs in production. It focuses on technical as well as functional support and covers items such as:
  • Content such as text, images, videos, colors and fields can be changed in the application
  • Multiple translations for screens can be managed without having separate versions for each language
  • New support people can be added to the support and maintenance team without any major training or coaching
Maintainability and flexibility improve agility, software quality and productivity, and will significantly lower development and maintenance costs if designed well. The architecture design chapter details further how applications can be designed to properly address these concerns.
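
One common way to manage translations without separate application versions is a message catalog keyed by language. The Python sketch below is a minimal, hypothetical example of that idea; the keys, texts and fallback rule are assumptions.

```python
# Hypothetical message catalogs: one set of screens, many languages.
MESSAGES = {
    "en": {"login.title": "Sign in", "login.submit": "Continue"},
    "es": {"login.title": "Iniciar sesión"},
}


def translate(key, language, fallback="en"):
    """Look up a screen text, falling back to the default language."""
    catalog = MESSAGES.get(language, {})
    if key in catalog:
        return catalog[key]
    return MESSAGES[fallback][key]


print(translate("login.title", "es"))   # translated text
print(translate("login.submit", "es"))  # not translated yet, falls back
print(translate("login.title", "fr"))   # unknown language, falls back
```

Adding a language then means adding a catalog, not building and maintaining another version of the screens.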

 Technical Requirements Input Considerations
Maintainability and Flexibility
  • Ability to change and orchestrate business processes, 
  • Ability to make rapid changes to business or functional parameters, rules, thresholds, 
  • Re-usability of functionalities, components or services, 
  • Required agility for minor enhancements to the system in production (e.g. less than 48 hours), 
  • Average time to deploy a medium-sized feature from test to production 


Operability

When talking about a system's operability requirements, one needs to address how easily an application can be operated by a technical support team. This includes that the system can operate in production with few problems, but also that a technical operations team can support the development and testing of the solution. The following requirements need to be considered:
  • Need for application monitoring, alerts, logging and audit trails
  • Ability to back up and restore or recover specific application data or state as of a specified time
  • Scheduling with specific triggers, such as events or start/stop times, that can be adjusted within a specific system window
  • Underlying software can be parameterized or upgraded without any major impact on the system. This covers the stack of different products, such as the operating system, database technology, programming language, etc.
  • The application has its own development lifecycle software support. This is to ensure that application development can be properly supported by operations, covering software versioning, configuration management, automated deployments, etc.
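
As an illustration of the monitoring, alerting and logging requirement in the list above, the Python sketch below uses the standard `logging` module to route only error-level events to an alert channel. The in-memory handler is a hypothetical stand-in for a real alerting integration, and the logger name and messages are invented for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collects records in memory; a stand-in for a real alerting channel."""

    def __init__(self):
        super().__init__(level=logging.ERROR)  # only errors become alerts
        self.alerts = []

    def emit(self, record):
        self.alerts.append(self.format(record))


logger = logging.getLogger("payments")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example output self-contained

alert_handler = ListHandler()
alert_handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
logger.addHandler(alert_handler)

logger.info("payment validated for order %s", "A-1001")     # logged, no alert
logger.error("payment gateway timeout after %d ms", 5000)   # raises an alert

print(alert_handler.alerts)
```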
As in previous sections we highlight the most important inputs for the operability requirements:

 Technical Requirements Input Considerations
 Operability
  • Needs for application monitoring, alerting and logging, 
  • Ability to restore or recover previous data or state of specified time, 
  • Scheduling capability, 
  • Requirements for underlying operating system, programming language, database technology, 
  • Development lifecycle support, 
  • Need for mature operating systems, database technologies, programming language

Interoperability & Portability

Interoperability describes how an application can connect, communicate and exchange data with other systems. What typically needs to be understood here is:
  • the required integrations and connections with other systems, and 
  • what type of interconnectivity, integration protocol and standards are required, 
  • what data needs to be exchanged, data format and data model. The data model may depend on specific industry reference models, such as banking, insurance, airline, health, telecommunications that have their own data model and semantics,
  • Ability for third parties to use the system and integrate in an eco-system.
These requirements facilitate easier integration with other systems and applications with fewer complications, and they reduce dependence on the new application, as it can be replaced by another system in the future if needed.

An application that has high portability can run on other platforms. Portability requirements matter not only because they give developers different options to develop the solution on, but also because they avoid the potential risk of vendor lock-in and availability problems, for instance when using a cloud platform (e.g. Amazon, Azure, etc.).

Portability spans the full stack of products, such as:
  • Channels (such as Web browsers (e.g. IE, Chrome, Firefox etc.), tablets, kiosks, ATMs etc.)
  • Database (SQL, Oracle, DB2, etc.)
  • Operating system and infrastructure (Linux, Windows, AS/400, etc.)
  • Cloud infrastructure or cloud services (Amazon, Azure, Google, etc.)
  • Backward compatibility requirements with existing products or platforms
A common portability requirement is that the solution can run on different operating systems and databases. A more current requirement is the use of different infrastructure or cloud services. For instance, how easily can an application that has been built on Amazon cloud services be migrated to the Azure cloud, whether to have a secondary cloud provider or to negotiate better commercial terms?
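
One way such portability is often approached is to let the application code depend on a small interface instead of on a provider's service directly. The Python sketch below is a hypothetical illustration: `ObjectStore`, the two implementations and `archive_invoice` are invented names, and the in-memory and local-disk stores merely stand in for two different providers.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class ObjectStore(Protocol):
    """Hypothetical storage interface the application codes against."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in for one provider's storage service."""

    def __init__(self):
        self._items = {}

    def put(self, key, data):
        self._items[key] = data

    def get(self, key):
        return self._items[key]


class LocalDiskStore:
    """Stand-in for a second provider, backed by a directory."""

    def __init__(self, root):
        self._root = Path(root)

    def put(self, key, data):
        path = self._root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key):
        return (self._root / key).read_bytes()


def archive_invoice(store: ObjectStore, invoice_id: str, pdf: bytes) -> str:
    """Business logic that does not know which provider is behind the store."""
    key = f"invoices/{invoice_id}.pdf"
    store.put(key, pdf)
    return key


with tempfile.TemporaryDirectory() as tmp:
    for store in (InMemoryStore(), LocalDiskStore(tmp)):
        key = archive_invoice(store, "2020-001", b"%PDF-demo")
        print(type(store).__name__, store.get(key))
```

Switching provider then means writing one new implementation of the interface, while the business logic stays untouched.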

Programming platform portability (e.g. Java and .NET) is rarely seen as a requirement, as the effort for this typically results in a rewrite of the application, which would be costly and in many cases not feasible.

 Technical Requirements Input Considerations
 Interoperability
  • Integrations with other systems,
  • Type of inter-connectivity, protocol and standards, 
  • Data format and model,
  • Ability to integrate in third party eco system
 Portability
  • Channels, 
  • Database,
  • Operating system and infrastructure,
  • Cloud,
  • Product or platform backward compatibility

Security

Security is an area that is usually overlooked when it comes to requirements gathering. In some cases it is seen as part of a functional area, or due to priority and time pressure it is left to the end or looked at only when building the application. There are several aspects of security to ensure that the application and data can only be read and modified by intended users.
 
The first group of requirements is to secure the application against external attacks from hackers and malicious clients. The most typical vulnerabilities here are buffer overflows, SQL injection and session hijacking, which need to be reviewed. Typical requirements to be identified here are who interacts with the system, how, and from where the application can be accessed. A public application with a mobile or web front-end, for instance, requires a more stringent security architecture than a personal internal desktop application.
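
To illustrate one of these vulnerabilities, the Python sketch below contrasts a query built by string concatenation with a parameterized query, using the standard `sqlite3` module and an in-memory database. The table, the users and the function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "helpdesk")],
)


def find_user_unsafe(name):
    # Vulnerable: user input is concatenated into the SQL text.
    query = f"SELECT name, role FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(name):
    # Parameterized: the input is passed as data, never parsed as SQL.
    query = "SELECT name, role FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


malicious = "nobody' OR '1'='1"
print(find_user_unsafe(malicious))  # the injected condition matches every row
print(find_user_safe(malicious))    # no user has this literal name
```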

Another security area is the regulatory, compliance and privacy considerations that need to be complied with. These could be specific government regulatory requirements at country or international level (e.g. European Union), or compliance requirements of a specific industry sector (e.g. Sarbanes-Oxley, credit card data) or institution.

The third area is whether there are specific needs to protect functionalities or data within an application for specific users or roles. For instance, a portfolio manager can only see the data of his/her own clients, not that of other portfolio managers. Or a helpdesk employee can only run certain queries, but cannot open an account.
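
The two examples above combine role-based permissions (what a role may do) with data-level restrictions (which records a user may see). The Python sketch below shows both checks in their simplest form; the roles, actions, users and client identifiers are all hypothetical.

```python
# Hypothetical role-to-action mapping.
ROLE_PERMISSIONS = {
    "portfolio_manager": {"view_client", "open_account"},
    "helpdesk": {"query_status"},
}

# Hypothetical ownership of client records by portfolio manager.
CLIENT_OWNERS = {
    "client-1": "pm_anna",
    "client-2": "pm_ben",
    "client-3": "pm_anna",
}


def can_perform(role, action):
    """Functional restriction: is this action allowed for the role?"""
    return action in ROLE_PERMISSIONS.get(role, set())


def visible_clients(user):
    """Data restriction: a manager sees only his/her own clients."""
    return sorted(c for c, owner in CLIENT_OWNERS.items() if owner == user)


print(can_perform("helpdesk", "query_status"))  # allowed query
print(can_perform("helpdesk", "open_account"))  # not allowed for this role
print(visible_clients("pm_anna"))               # only Anna's clients
```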

To identify software security requirements, one needs to understand the main needs, as the architecture may become very different depending on them.

 Technical Requirements    Input Considerations
 Security
  • Who interacts with the application, how and from where, 
  • Regulatory, compliance and privacy needs, 
  • Specific user and/or role access to application data and functionalities,
  • Specific Data protection and Integration Requirements

Usability

Usability describes the accessibility of the application and its interactivity with its audience. It is another architectural area that is frequently overlooked, as many consider it part of functional design rather than technical design. However, usability includes features and technical functionalities that require a certain underlying IT architecture design.

A good example is the choice of user interface, such as web, mobile or custom application design. Another area is accessibility for people with disabilities who need better articulated screens or audio, which may result in additional architectural components.

Considerations that need to be analyzed as part of usability are:
  • Type of user interface required to interact with users (e.g. mobile screen, web, desktop, ATM, kiosks, interactive augmented reality, etc.) 
  • Type of interaction flow (vertical/horizontal scrolling, interactive) and flexibility in look & feel in application screens 
  • Integration of multimedia or plugins such as video players, maps
  • Specific impairment needs (cognitive, visual, auditory, motor) 
  • Specific needs for control or feedback interactivity such as assistance, help, notification and error messages with e.g. pop-up windows, virtual assistant, etc.
These requirements direct certain architecture choices and design that need to be detailed in the next phases of the project.

 Technical Requirements Input Considerations
 Usability
  • User interface type(s),
  • Interaction flow and look & feel,
  • Multimedia / Plugins integration, 
  • Specific Impairment needs,
  • Specific control or feedback interactivity needs

Other -ilities

There are other categories of technical requirements, which we discuss only briefly because they are either a combination of those already mentioned above or less relevant for architecture design:

 Technical Requirement Category   Related to or combination of:
 Reliability                      Availability, scalability and maintainability
 Agility                          Flexibility and Maintainability
 Testability                      Maintenance and Operability
 Extensibility                    Interoperability
 Reusability                      Flexibility
 Backward compatibility           Portability
 Recoverability                   Availability
