25Minutes: Insights. Expertise. Impact.

8 - Qusai AlRabei: How to design and run an effective OT SOC & lessons from the field that can save millions

Eliel Mulumba

What does it really take to build and operate an OT Security Operations Center (SOC)? In this episode of 25Minutes, I sit down again with Qusai - an expert with hands-on experience setting up OT SOCs, including his first major project in the Middle East. We explore the key differences between OT and IT incident response, the unique challenges of industrial environments and why traditional approaches often fall short. Qusai shares common mistakes companies still make, how to develop tailored use cases and playbooks and which parameters matter most when deciding between an OT SOC, IT SOC, or a hybrid model. If you're in cybersecurity, industrial operations or simply want to understand the nuances of securing operational technology environments, this episode is packed with actionable insights and lessons learned from the field.

Important note: The views and opinions expressed in this episode are solely those of the individuals involved and do not necessarily reflect those of any organization, employer or affiliation. 

Our Guest: 

LinkedIn: https://ae.linkedin.com/in/qusai-alrabei-cybersecurity

https://www.weforum.org/stories/2023/12/why-securing-the-ot-environment-is-important/

25 Minutes Podcast

Hosted by: Eliel Mulumba

Audio editing & mastering: Michael Lauderez

Join the conversation on LinkedIn: www.linkedin.com/in/eliel-mulumba-133919147

It's again a real pleasure to welcome you as a guest on 25 Minutes. We had a great first episode and the feedback was amazing. We both felt that there are a couple of topics we can still elaborate on when it comes to industrial control systems: specifically asset protection, incident response, how to set up an OT SOC and what the future of OT security looks like. To start with, we talked a bit about asset inventory last time. I would like to understand how you see asset protection today, especially in ICS environments with a life cycle of 20 to 30 years, which is very long, and where you cannot just plug and play and install a new patch on a running system. So how can security updates be implemented without disrupting operations, from your point of view?

Let me start with asset management in the OT environment. It has always been a challenge to keep up to date with all the assets connected to the OT systems and the OT environment.

It's really essential to have full visibility of what we have in our environment. We cannot control something that we don't know about. We cannot enforce controls on a ghost, on something we are totally unaware is connected to our environment.

So we have to have that visibility. Nowadays, luckily, we have many tools and solutions from vendors that keep track of all connected assets and the status of each asset, and that monitor the vulnerabilities that are published internationally. We're in a lucky era, right?

Right now there is a boom in OT monitoring technology. But years back, everything was done in Excel. I've seen lots and lots of industrial plants that keep track of their assets in an Excel sheet. Most of the time those sheets are not up to date, and they don't know what they have and what is to be monitored.

So that's maybe one of the positive things we have today. Lots of tools enable us to monitor what we have and to detect things that we never thought were connected to our network. Now, the next step... sorry to cut you off, go on.

No, no, that's good. Carry on.

Now, the end users, especially those with a good maturity level in understanding the cybersecurity threats to the OT environment, start to think about going the extra mile, and they start talking about having a standalone OT SOC.

Previously, from what I've seen, some end users tried to connect the OT environment to the existing IT SOC they had. Some of these companies found it very difficult, because they wanted very specific use cases for that environment.

They don't want to just take the ready-made use cases of their IT environment, enable those correlation rules or use cases, and apply them to OT. This won't work, and it will be a source of distraction for the analysts and engineers working in the typical SOC.

I was very privileged to design the biggest and first OT SOC in the Middle East for one of the very important oil and gas end users. That happened six years back. It was very challenging work. And when I say challenging, it's not about what technology we need to adopt, what tools, what solutions, because tools and solutions are available.

The trick is how to use those solutions, how to configure them and how to tailor them to the process and the operation that you have. This is the key differentiator. This is what makes your OT SOC a successful SOC: a control center that enables you to monitor everything in your environment and to respond in a timely manner.

So you will not get distracted.

When it comes to the OT SOC, you have now opened a big topic, a very big topic. We have known the IT SOC for a decade, and many companies are becoming more and more mature with it. I know that many companies and organizations are struggling nowadays with setting up an OT SOC. In Switzerland, I have seen just a handful that are really operating this in the right way. Going into that, I would like to understand the difference between IT and OT incident response. Can you explain to us why incident response in IT is not the same as in an OT environment?

Incident response is incident response, but the consequences are sometimes different between IT and OT. A cyber incident in an OT environment can have severe consequences. Imagine that you have a threat or an attack on an offshore oil and gas platform.

First you will have a shutdown. Then you might have an oil spill, so you have environmental damage. You could have fatalities among the people who are working there. It could cause an explosion or a fire, or, if it's in a nuclear power plant, you might have radiation.

So the OT environment should be taken very seriously, and the planning for responding to incidents should be done precisely, by real experts who have hands-on experience in that specific environment. We are always focusing on keeping the operation running: the process and the operation are the top priority.

And we need things to be available. If you go to any of the engineering studies for a control system, they ask for the system to be available 99.99999 percent of the time, which means that whenever I need the service of that system, I should have it available.

This is the physical meaning of that figure. So we're talking about a very critical process, and any cyber threat to that process could have a catastrophic impact. That's where the difference comes from. The second point is how to recover fast.

This is very much alike between IT and OT: the first step is to stop the bleeding of the system, and then we try to search for the root cause. So we need to stop the bleeding and recover as fast as possible. Then we start analyzing the root cause of the problem.

The sequence is the same when we respond to any incident. But the question here is how impactful and how severe each incident is. That is the tricky part.
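To make the availability figure mentioned above more tangible, here is a minimal sketch that converts an availability target into the downtime it allows per year. The function name and the list of targets are illustrative assumptions and do not come from the episode.

```python
# Illustrative only: converts an availability target into a yearly downtime budget.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def allowed_downtime_seconds(availability_percent: float) -> float:
    """Return the downtime per year implied by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Hypothetical targets; the last one is the figure quoted in the conversation.
    for target in (99.9, 99.99, 99.999, 99.99999):
        print(f"{target}% -> {allowed_downtime_seconds(target):.1f} s/year")
```

Under this simple model, 99.999 percent leaves a little over five minutes of downtime per year, and the quoted 99.99999 percent leaves only about three seconds.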

And I see that nowadays AI and machine learning are being introduced for anomaly detection in OT networks. Do you believe that those solutions are already viable enough as a security enhancement, or do they still need some time?

Well, some of the vendors will not like the statement that I'm going to make here. The AI anomaly detection solutions are good tools that were introduced for use in the OT environment, and they are now becoming an essential part of any OT SOC design.

But we need to be very careful, because we need to understand the baseline. We need to spend some time to make sure that we don't have too many false positives. A lot of solution providers claim that their solution doesn't generate that many false positives, but from what I've seen and experienced myself, they still need some time. As a complementary solution, they're very good.

But there should always be an experienced human behind the whole process who is able to analyze the alarms coming from the system.
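The point about baselines and false positives can be illustrated with a deliberately simple sketch. It is not how any vendor product works: it assumes a hypothetical traffic metric, learns a mean and standard deviation from known-normal readings, and flags values beyond `k` standard deviations. All names and numbers are made up for illustration.

```python
from statistics import mean, stdev


def learn_baseline(samples):
    """Summarise normal behaviour as (mean, standard deviation)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to learn a baseline")
    return mean(samples), stdev(samples)


def find_anomalies(values, baseline, k=3.0):
    """Return the values more than k standard deviations from the baseline mean."""
    mu, sigma = baseline
    return [v for v in values if abs(v - mu) > k * sigma]


if __name__ == "__main__":
    # Hypothetical packets-per-second readings taken during known-normal operation.
    normal = [100, 102, 98, 101, 99, 100, 103, 97]
    baseline = learn_baseline(normal)

    live = [101, 96, 140, 104]
    print(find_anomalies(live, baseline, k=3.0))  # looser threshold, fewer alerts
    print(find_anomalies(live, baseline, k=1.0))  # tighter threshold, more alerts
```

With the tighter threshold, readings that are only slightly off the baseline are flagged as well. Whether those are real anomalies or false positives is exactly the judgment that still needs an experienced analyst.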

Something that I would also like to touch on when it comes to the OT SOC is the building blocks. Let's assume I'm a client reaching out to you saying, "Hey, Qusai, I think I need an OT SOC. Can you help me set this up?" What would be your approach to this kind of scenario? Would you recommend integrating the OT SOC into the existing IT SOC?

Would you recommend a separate one, or a hybrid model? Maybe you can share a bit of your thought process here.

Well, you mentioned the different recommendations that we might have, but it all depends on how big that industrial plant is. First of all, we are not only talking about cybersecurity itself. We are also talking about business. It's a business case.

If I'm asked this question, I will first of all try to understand how big that system is. Is it feasible to build a dedicated OT SOC for that environment or not? It doesn't make any sense to talk about a standalone OT SOC if you have a very small plant or factory.

But if you have a huge critical infrastructure, for example different oil and gas plants or petrochemical processes, it makes sense to focus only on the use cases that you have for your environment. So it depends. If it's a small environment, if it's not feasible to have a standalone SOC, and if the use cases for that specific environment can be identified and integrated in a hybrid style, that would be fine.

But we need to focus on securing the interface between the OT system and the IT SOC. This is something I keep repeating: we need to make sure that the communication interface is secured.

When you talk about the business case, do you have a threshold in mind? Are we talking about 500 assets, a thousand assets or more? Would you then multiply that by the number of sites they have? I would like to understand the reasoning behind this process, to come to a conclusion about what might be best for an organization.

For me personally, and my opinions are always individual opinions that of course do not reflect any affiliation or company, I always look at it from a top level. I don't like to count the assets and say I have 500 or 1,000 assets.

I look at the operation itself. I look at the process. I look at how big that plant is. For example, if you have a refinery, the refinery itself is a huge plant, so you can have a dedicated OT SOC for that refinery.

Maybe you can connect different plants together. Again, for me, there is no threshold for the number of assets. It depends on how deeply we understand the process, how deeply we understand the operation, and how many use cases we could have.

If we have very few typical use cases for the OT environment, then maybe it is not feasible to have a dedicated OT SOC. But if we have dozens of use cases, then it makes sense to have a dedicated OT SOC, plus dedicated experts who understand those operational and process use cases. So for me, I look at it from the number of use cases.

With the number of use cases, you have given us more visibility into the thought process around setting up an OT SOC. I believe you have seen many clients struggling with this, not looking at it from the right angle, not having the right business case. Can you explain this a bit?

What are the biggest misconceptions you have seen when it comes to setting up an OT SOC?

The biggest misconception I have seen is that they think having an OT SOC with the latest and greatest technology will solve all the problems. Another misconception is that if we automate everything and have those good tools, we don't need engineers or experts to sit behind them and verify what's being done by those solutions.

These are really critical points, because you need to really understand what you have and what you want to monitor. I'll just give you a simple example. Everything starts from having a proper risk assessment for that environment.

It's not a checklist or yes-or-no questions that we fill in and say, "Okay, fine, this is the risk that we have, this is available, this is not available, this is good to have, this is a must." It doesn't work this way. A proper risk assessment for the OT environment starts from understanding how your system works and what the probability is of that PLC, with a certain inject, going wrong, for example, or what could happen if someone changed the sequence of operation for a motor or a valve.

It's not only about having ransomware on the engineering workstation. It's about changing the sequence of operation, or masking some of the alarms. We need to look at these very fine details in the operation, and we don't want to overlook them.

For example, most of the SIEMs that are available today come preloaded with thousands of use cases. I've seen many end users just take the best technology and enable everything. This is not applicable.

If a correlation rule or use case does not apply to OT, why would you enable it? You need to be very precise about the operation, the process and the use cases that you have. And of course, every environment or plant is different from the others.
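The idea of enabling only the use cases that apply to a given plant, instead of everything a SIEM ships with, can be sketched as follows. The catalog entries, the source names and the applicability criterion (a rule applies only if the plant actually has the data sources it needs) are assumptions made for illustration. They are not taken from any real SIEM product or from the episode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    required_sources: frozenset  # telemetry sources the rule depends on


def applicable_use_cases(catalog, available_sources):
    """Keep only the rules whose required sources all exist in this plant."""
    available = frozenset(available_sources)
    return [uc for uc in catalog if uc.required_sources <= available]


# Hypothetical preloaded catalog mixing IT-style and OT-style rules.
CATALOG = [
    UseCase("Email phishing link clicked", frozenset({"mail_gateway"})),
    UseCase("PLC logic download outside maintenance window",
            frozenset({"ot_network_monitor"})),
    UseCase("Alarm suppression changed on HMI", frozenset({"hmi_audit_log"})),
]

if __name__ == "__main__":
    plant_sources = {"ot_network_monitor", "hmi_audit_log"}  # assumed inventory
    for uc in applicable_use_cases(CATALOG, plant_sources):
        print(uc.name)
```

In practice the filter would come from the risk assessment and process knowledge described above, not only from which log sources exist. The sketch only shows that the enabled set should be a deliberate subset of the catalog.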

Thank you very much, Qusai, for sharing this knowledge and perspective on OT SOC misconceptions. One of the things we also wanted to discuss is the future of OT security, emerging trends and technologies. Nowadays we talk a lot about cloud potentially being the future of industrial control systems. What is your take on that? Do you believe this is the right path, or should we still consider hybrid models?

Well, we cannot resist the evolution of technology. Rather, from the cybersecurity perspective, we should always be ahead of any evolution in the technology, anticipating the challenges ahead of time and being proactive in providing solutions, because we don't want to be behind and we don't want to just react when there is a problem.

There are some concepts now. Some people are talking, for example, about PLC as a service in the cloud. You have a lot of IoT offers here and there from different providers. We cannot resist the evolution of the technology.

But my advice is that we should be ahead of time. We should go hand in hand with the latest technology and try to be proactive rather than reactive, not just acting when there is a threat. We should visualize and anticipate the proper threat model and then act accordingly.

Thank you very much for that as well, Qusai. We're approaching the end of this episode. I would be curious whether there is any anecdote from your career that you want to share with the audience, something that was an aha moment or that has really shaped your career?

The advice that I always have for the young generation, and I'm still young myself, is two things. First, try to get hands-on experience whenever that is possible. Seeing something in simulation software or reading standards, controls and frameworks is not enough.

If you don't have proper hands-on experience, you cannot be a good expert in the future. Of course you can work and develop your career, but you will feel that something is missing if you don't have that hands-on experience.

The second one is to keep up to date with the latest technology. It's very wrong for an engineer or someone working in technology to say, "Yeah, yeah, I have all the knowledge." No one has all the knowledge. The evolution of technology and knowledge never stops.

The moment you stop learning, you're out of the competition, and you will not be alive. Feeding yourself with science and technology and understanding what's going on around you is what keeps us alive. The moment we stop, we're out of that competition.
