How Risk Assessment can help you to Manage OT/ICS Cyber Security

Optimising cyber security operational cost is a bigger issue today than implementing the security controls in an OT environment. The ultimate objective of this blog series is to understand how to optimise OT security operational cost while minimising exposure to cyber security risk. We will see how performing a risk assessment and designing a cyber reference architecture can help critical infrastructure operators realise significant gains.

Remember the ultimate objective is to manage complex OT security while minimising operational cost.

We will take a step-by-step approach to understand how to perform a risk assessment and design security zones and conduits. The IEC 62443 series of standards can assist in each step of the process, from selecting the system under consideration (SuC) through to evaluating the cyber security requirements specification and selecting security technologies.

Source: DNV-GL

Step 0: Talk to management and IT/OT engineers, and understand what is most important to them.

Step 1: To secure and optimise assets we need a detailed inventory. Asset owners and engineers need to list the assets in ABC Gas Company to form that inventory, as below (the assets listed are fictional, like Superman 😃). I have written another blog on how we can use NIST SP 1800-23 for managing a complex portfolio of IACS assets.
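One way to keep such an inventory machine-readable is sketched below in Python. The field names, and every asset other than OWS 1, are hypothetical and chosen only for illustration; they are not taken from any standard.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    asset_id: str    # unique tag used throughout the assessment
    asset_type: str  # e.g. "workstation", "PLC", "switch"
    function: str    # what the asset does for the plant
    owner: str       # team accountable for the asset


# Fictional inventory for ABC Gas Company (all entries are made up).
inventory = [
    Asset("OWS 1", "workstation", "wind speed monitoring for HSE", "HSE team"),
    Asset("PLC 1", "PLC", "gas production unit control", "process control team"),
    Asset("SW 1", "switch", "control network switching", "network team"),
]


def duplicate_ids(assets):
    """Return asset IDs that appear more than once in the inventory."""
    counts = Counter(a.asset_id for a in assets)
    return sorted(asset_id for asset_id, n in counts.items() if n > 1)


def count_by_type(assets):
    """Count assets per type, a first sanity check on inventory coverage."""
    return dict(Counter(a.asset_type for a in assets))
```

Checking for duplicate identifiers early matters because the later steps (zones, conduits, risk ratings) all refer back to assets by their ID.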

Step 2: Prepare the initial system architecture, i.e. map the OT assets to the ISA-95/Purdue reference model according to the zones they reside in. At this stage it is wise to involve your network, system and security engineers in order to understand how your environment functions. See the INCIBE-CERT blog for a brief overview of how to partition a system into functional layers.

Step 3: Perform an initial (inherent) and a detailed risk assessment as part of designing the Purdue reference architecture, and derive a risk profile for each asset, using IEC 62443-3-2 as the reference.
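As a rough illustration of how a risk profile can be derived, the sketch below combines a likelihood and an impact rating into a single risk rating. The three-level scale, the scoring and the thresholds are assumptions made for this example; they are not values taken from IEC 62443-3-2, and in practice you would use your own organisation's risk matrix.

```python
LEVELS = ("Low", "Medium", "High")


def risk_rating(likelihood: str, impact: str) -> str:
    """Combine likelihood and impact into a rating (illustrative thresholds)."""
    # Map each level to 1..3 and multiply, giving a score between 1 and 9.
    score = (LEVELS.index(likelihood) + 1) * (LEVELS.index(impact) + 1)
    if score >= 6:
        return "High"
    if score >= 3:
        return "Medium"
    return "Low"


# Hypothetical inputs for a segregated monitoring workstation.
print(risk_rating("Medium", "Medium"))
```

A multiplicative score is only one possible design; a lookup table agreed with management works just as well and is easier to audit.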

Step 4: Design security zones and conduits, i.e. identify and understand how assets interact: over which protocols, through which network channels, using which services, along which routes, etc. A brilliant example of documenting zone and conduit details can be found in the Tofino Security white paper (a screenshot from the paper is shown below for reference).

Source: Tofino Security

Before we jump into documenting the zone and conduit information, it is wise to first get a visual understanding of how the assets talk to each other. The Tofino Security white paper also provides guidance on how to start visualising and documenting those details.

Step 5: Once you have captured a visual representation of your environment, update your cyber reference model to reflect the zones and conduits, together with the risk ratings from the risk assessment activity.

You can then prepare an Excel file capturing the details below. The following column attributes (based on the guidance provided in IEC 62443-3-2) give a holistic level of detail.

  • Name or unique identifier of the zone/conduit
  • Logical limits
  • Physical limits, if applicable
  • List of all access points and all the assets involved
  • List of all types of data flows/protocols associated with each access point
  • Connected zones and conduits
  • Asset list
  • Assigned security level
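The attributes above can be sketched as a small Python record that is exported to CSV and opened in Excel. The class name, the field names and the contents of the sample zone are hypothetical; only the column list follows the attributes given above.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class ZoneConduitRecord:
    identifier: str
    logical_limits: str
    physical_limits: str
    access_points: list[str]
    data_flows: dict[str, list[str]]  # access point -> protocols seen on it
    connected_to: list[str]           # connected zones and conduits
    assets: list[str]
    security_level: int               # assigned security level


def to_row(rec: ZoneConduitRecord) -> dict:
    """Flatten one record into spreadsheet columns."""
    flows = "; ".join(f"{ap}: {', '.join(p)}" for ap, p in rec.data_flows.items())
    return {
        "Identifier": rec.identifier,
        "Logical limits": rec.logical_limits,
        "Physical limits": rec.physical_limits,
        "Access points": "; ".join(rec.access_points),
        "Data flows/protocols": flows,
        "Connected zones and conduits": "; ".join(rec.connected_to),
        "Assets": "; ".join(rec.assets),
        "Assigned security level": rec.security_level,
    }


def to_csv(records: list[ZoneConduitRecord]) -> str:
    """Render the records as CSV text with one header row."""
    rows = [to_row(r) for r in records]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up example zone; none of these values come from a real plant.
zone_c = ZoneConduitRecord(
    identifier="Zone C",
    logical_limits="Control VLAN",
    physical_limits="Control room",
    access_points=["FW 1"],
    data_flows={"FW 1": ["Modbus/TCP", "RDP"]},
    connected_to=["Conduit C-D"],
    assets=["PLC 1", "SW 1"],
    security_level=3,
)
print(to_csv([zone_c]))
```

Keeping the data flows keyed by access point makes it easy to spot protocols that cross a zone boundary without a documented conduit.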

Step 6: (Cyber Security Requirements Specification) Based on the above architecture and risk ratings, we will now derive cyber security requirements from the IEC 62443-3-3 standard, e.g. vulnerability management, patch management, user access management, etc. For instance, an asset in Zone C with a high target security level (SL-T) should have two-factor authentication along with domain authentication, and needs to be patched regularly. Below is the list of foundational requirements (FRs) from IEC 62443-3-3.

  • FR 1 – Identification and authentication control
  • FR 2 – Use control
  • FR 3 – System integrity
  • FR 4 – Data confidentiality
  • FR 5 – Restricted data flow
  • FR 6 – Timely response to events
  • FR 7 – Resource availability
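A minimal sketch of how requirements could be looked up from a zone's SL-T is shown below. The seven foundational requirements are the ones listed above; everything else is an assumption made for illustration: the threshold for a "high" SL-T, the rule table, and which FR each control is filed under. The actual requirements must be read from IEC 62443-3-3 itself.

```python
from enum import Enum


class FR(Enum):
    IAC = "FR 1 - Identification and authentication control"
    UC = "FR 2 - Use control"
    SI = "FR 3 - System integrity"
    DC = "FR 4 - Data confidentiality"
    RDF = "FR 5 - Restricted data flow"
    TRE = "FR 6 - Timely response to events"
    RA = "FR 7 - Resource availability"


HIGH_SL_T = 3  # assumption: "high" means SL-T 3 or above

# (minimum SL-T, foundational requirement, control); illustrative rules only.
RULES = [
    (1, FR.IAC, "domain authentication"),
    (HIGH_SL_T, FR.IAC, "two-factor authentication"),
    (HIGH_SL_T, FR.SI, "regular patching"),
]


def required_controls(sl_t: int) -> dict:
    """Return the controls whose minimum SL-T is met, grouped by FR."""
    result = {}
    for min_sl_t, fr, control in RULES:
        if sl_t >= min_sl_t:
            result.setdefault(fr, []).append(control)
    return result


for fr, controls in required_controls(3).items():
    print(fr.value, "->", ", ".join(controls))
```

Because the rules are plain data, the same table can be reviewed with asset owners and extended as the requirements specification matures.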

This phase will assist us in selecting technologies that can deliver the required services based on our evaluation. For example, we will be able to understand whether an automatic patch management solution such as WSUS would be the right fit for us or not. The SANS whitepaper is a great resource for identifying which security services are required at which SL-T.

A cost-benefit analysis of our efforts can help us validate and justify how millions of dollars can be saved each year by following a structured approach to managing OT cyber security. Below is a hypothetical example in which cost is reduced by formulating risk ratings for our OT assets and simply identifying where we really need to put controls in.

As we have discussed from the beginning, ABC Gas Company incurs heavy operational costs in its patch management process. OT engineers at the company work very hard every day to keep up with cyber security threats and to keep the patch level of the systems up to date. But the approach is not optimized: the engineers do not know what to focus on and what not to, so they put equal effort into securing every IACS asset.

For example, OWS 1 is used to monitor wind speed data for HSE purposes and is segregated from other networks. Even if this HMI gets compromised, it will not have any significant effect on the gas production units. So patching the HMI on a frequent, regular basis may not be required, as it does not pose any severe cyber risk to the business or plant operations.

Before the risk assessment and security architecture activity was completed, the overall cost to maintain patch levels was 430 USD every month (refer below for details). All assets were treated alike and valued equally, so efforts were not optimized: assets that are of less value to the organization than others were also included in the regular patch cycle, increasing the operational costs.

Through careful analysis of the risks the OT systems are exposed to, and of which services are really required and where, the engineers were able to minimize the risk exposure and decrease the cost of maintaining the assets. For example, after the risk assessment activity OWS 1 is rated as Medium, so it is safe to relax the patch cycle for this HMI: the HMI monitoring wind speed can be patched on a monthly or quarterly basis, as it does not serve a critical purpose in gas production.

Based on the above analysis, the new estimated cost of the OT patch management process is 305 USD per month, a figure that relies heavily on the risk profile data derived by the engineers. Please refer to the example calculation and sheet below.
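The saving implied by the two figures can be worked out with a few lines of Python. This sketch assumes that the 305 USD estimate is a monthly figure, like the 430 USD it is compared against; the function name is made up for the example.

```python
def patch_cost_saving(before_usd: float, after_usd: float):
    """Return (absolute saving, saving as a percentage of the original cost)."""
    saving = before_usd - after_usd
    return saving, saving / before_usd * 100


monthly_saving, percent = patch_cost_saving(430, 305)
annual_saving = monthly_saving * 12  # assumes the monthly figures hold all year

print(f"{monthly_saving} USD/month, {annual_saving} USD/year, {percent:.1f}% lower")
# prints: 125 USD/month, 1500 USD/year, 29.1% lower
```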

Scenario Based Example

Now let’s use the above model and fit it to a sample OT asset base. Let’s say ABC Gas Company has 2000+ OT assets, including 1000+ workstations and servers, 500+ pieces of network equipment and 500+ PLCs. Maintaining a system incurs manpower cost, energy cost, vendor cost (if applicable), downtime cost, etc.

Typically, patching a Windows system takes around 20 minutes (including manually transferring the patches, taking the system offline and restarting it, and assuming everything goes well), and a Windows system in OT would be patched at least twice a year because it is a prime target. For a network device (router, switch or firewall) the time could range between 15 and 30 minutes; we will assume 20 minutes for our hypothesis. For PLCs it could take from around 20 minutes to an hour depending upon the product; we will take the average of 40 minutes for our hypothesis.

  • 1000 Windows machines * 20 minutes * 2 (systems patched twice a year) = 40,000 minutes (about 667 man hours, or 83.3 man days at 8 hours per man day).
  • 500 network devices * 20 minutes = 10,000 minutes (about 167 man hours, or 20.8 man days). Considering network equipment would be patched once in two years, the number of days required each year would be 20.8/2 = 10.4.
  • 500 PLCs * 40 minutes = 20,000 minutes (about 333 man hours, or 41.7 man days). PLCs would be patched once in three years, as patches for these devices are not released as often as for other assets. So the number of days required each year would be 41.7/3 = 13.9.
  • Total man days needed in a year to patch OT assets = 83.3 + 10.4 + 13.9 = 107.6 (rounded up to 110 man days to allow for miscellaneous scenarios which may interrupt patch operations).
  • Let’s say applying a patch costs around 70 USD/hour (inclusive of manpower/salary cost, vendor consultation, yearly subscription and downtime of the OT system).
  • 110 man days each year are required for patching, which translates to 880 hours each year.
  • Total cost incurred in patching OT assets = 880 hours * 70 USD = 61,600 USD

If we keep treating every asset as equally critical, we will be spending 61,600 USD a year just on patching our OT assets (assuming no emergency patches were applied, no vendor released an advisory and no severe hacks were observed in critical infrastructures).
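The scenario arithmetic can be re-run with the short Python sketch below, which makes it easy to try other fleet sizes or patch cycles. It assumes 60 minutes per hour, an 8-hour man day, and a buffer obtained by rounding up to the next ten man days; the constant and function names are made up for the example.

```python
import math

MINUTES_PER_HOUR = 60
HOURS_PER_MAN_DAY = 8   # assumption: one man day is an 8-hour shift
COST_PER_HOUR_USD = 70  # hourly cost used in the scenario

# (asset class, count, minutes per patch, patch rounds per year)
FLEET = [
    ("Windows machines", 1000, 20, 2),   # patched twice a year
    ("network equipment", 500, 20, 1 / 2),  # patched once in two years
    ("PLCs", 500, 40, 1 / 3),            # patched once in three years
]


def man_days_per_year(count, minutes_per_patch, rounds_per_year):
    """Yearly patching effort for one asset class, in man days."""
    hours = count * minutes_per_patch * rounds_per_year / MINUTES_PER_HOUR
    return hours / HOURS_PER_MAN_DAY


raw_days = sum(man_days_per_year(c, m, r) for _, c, m, r in FLEET)
# Round up to the next ten man days as a simple buffer for interruptions.
planned_days = math.ceil(raw_days / 10) * 10
annual_cost_usd = planned_days * HOURS_PER_MAN_DAY * COST_PER_HOUR_USD

print(f"{raw_days:.1f} man days, planned {planned_days}, {annual_cost_usd} USD/year")
```

Lowering the patch rounds per year for asset classes with a lower risk rating shows directly how a risk-based patch cycle moves the annual figure.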

Similarly, there are several other areas in OT security (security monitoring, network security, vulnerability management, Active Directory, security baseline implementation) which are as important as patching the assets and would demand equivalent effort and budget. It also depends on how proactively an organisation practises cyber security.

If we want to minimise our costs, the idea is to perform a detailed security risk assessment across all OT assets and develop a Purdue reference architecture, with no exceptions, if maturity and cost optimisation are expected. Once completed, the cost benefit realised could be in the region of 30%-40% (as not everything is important, and not everything requires immediate attention at all times).

The best part is that the budget required to implement the plan would be close to zero, as the engineers already have the knowledge and expertise required. The main idea is to align our efforts towards the maturity we intend to achieve, utilizing existing resources and manpower.

Conclusion: Cyber security is not one person’s job, and when it comes to OT security it gets more complicated. The best approach is to break the complex OT security process into smaller, manageable pieces. Performing a detailed risk assessment and designing a Purdue reference architecture has the potential to manage complex OT security holistically.

Several other processes in OT cyber security also need attention and optimisation, such as logging and security monitoring, backup and restore, network security, incident management, etc.

Note: The concept detailed above, with its assets and zones, does not reflect any organization’s actual assets or documentation. Any relation between the fictional assets, zones, conduits and definitions and another entity is entirely coincidental.