An outage can start as ordinarily as a single relay falling silent in the field. Then alarms rain down, data flow stops, and decision-making is delayed. In sectors like energy, water, wastewater, oil and gas, and transportation, this delay translates into real-world costs. This is precisely where redundancy comes into play. An architecture using dual SCADA servers, dual RTUs, and dual SIMs maintains data and control through hot standby, failover, and high availability. The goal is clear: operations must not stop, no data may be lost, and the disaster recovery plan must be ready at all times.

In this article, we address the fundamental building blocks of this architecture and practical implementation points in a simple and focused manner. We explain the path to establishing a culture of high availability, from the field to the center, and from hardware to connection.


Why is High Availability Critical? Risks, Costs, and Goals

Downtime in a SCADA environment is not just the screen going dark. A pump might stop, a valve might remain in the wrong position, or alarms might be delayed. This means production loss, risk of environmental discharge, security vulnerabilities, and regulatory violations. Scenarios such as loss of pressure in water distribution, uncontrolled switching in a substation, or a leak detected late in a pipeline can escalate quickly.

Uptime percentage is a simple way to express the business impact. Over the course of a year, the difference between 99.95% and 99.999% is the difference between more than four hours of downtime and about five minutes.

Uptime Percentage | Estimated Annual Downtime
99.95% | Approx. 4 hours 22 minutes
99.99% | Approx. 52 minutes
99.999% | Approx. 5 minutes
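
The figures in the table follow from simple arithmetic: the unavailable fraction multiplied by the hours in a year. A minimal sketch, assuming a 365-day year; the function name `annual_downtime` is chosen here for illustration only:

```python
from datetime import timedelta

HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def annual_downtime(uptime_percent: float) -> timedelta:
    """Return the downtime per year implied by an uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    unavailable_fraction = (100.0 - uptime_percent) / 100.0
    return timedelta(hours=HOURS_PER_YEAR * unavailable_fraction)


for pct in (99.95, 99.99, 99.999):
    minutes = annual_downtime(pct).total_seconds() / 60
    print(f"{pct}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

The same function can be used the other way around when setting targets: pick the downtime the operation can tolerate, then see which availability class it corresponds to.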

RTO (Recovery Time Objective) is the maximum acceptable time to restore the system. RPO (Recovery Point Objective) is the maximum acceptable window of data loss. In a SCADA context, RTO should be expressed in minutes and RPO in seconds, because alarm and event data must keep their timestamps and remain consistent after recovery. Security is a separate topic; especially in critical infrastructures, incorrect data is as dangerous as an incorrect command. Environmental and compliance risks are managed with reportable data and archive continuity.

Therefore, the goals must be clear: high availability, reliable failover, consistent data, and rapid disaster recovery. Redundancy is not an option; it is the foundation of sustainable operation.


Dual SCADA Server: Hot Standby and Fast Failover Design

Consider two SCADA servers: one Active, the other Hot Standby. The Active server runs, handles all sessions, and generates alarms. The Standby server keeps the same database, alarm status, and user sessions synchronized. If the Active server fails, failover occurs automatically and quickly. The goal is for the operator to continue their work without noticing the change.

Critical components of this design:

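  • Heartbeat: The servers check each other at regular intervals. Tune the interval and the thresholds for packet loss and latency so that a real failure is detected quickly without a brief network glitch triggering a failover.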

  • Quorum: Decisions in a multi-node setup are made by majority. This prevents unilateral decisions.

  • Split-brain prevention: Prevents two active servers in the event of a network partition. A witness node or tie-breaker is used.

  • Database replication: Data is kept up to date with synchronous or semi-synchronous replication. The RPO target determines which mode to choose.

  • Session and alarm sync: Operator screens, alarm flow, and acknowledgment information must remain consistent.
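
How heartbeat, quorum, and the witness node work together can be sketched as a single promotion decision on the standby server. This is an illustrative sketch, not the logic of any specific SCADA product; the class `StandbyView`, the function `should_promote`, and the 3-second timeout are assumptions made for the example:

```python
from dataclasses import dataclass


@dataclass
class StandbyView:
    """What the standby server can observe at decision time (hypothetical fields)."""
    seconds_since_peer_heartbeat: float
    witness_reachable: bool
    witness_sees_peer: bool


def should_promote(view: StandbyView, heartbeat_timeout_s: float = 3.0) -> bool:
    """Decide whether the standby may become active.

    Promotion requires both conditions: the active server's heartbeat has
    timed out, and the witness confirms it cannot see the active server
    either. A standby that cannot reach the witness holds only 1 of 3 votes
    and stays passive, which is what prevents split-brain.
    """
    peer_silent = view.seconds_since_peer_heartbeat > heartbeat_timeout_s
    if not peer_silent:
        return False
    if not view.witness_reachable:
        return False  # isolated node: no majority, do not promote
    return not view.witness_sees_peer


# Active server is down: heartbeat lost and the witness agrees.
print(should_promote(StandbyView(5.0, witness_reachable=True, witness_sees_peer=False)))
# Network partition on the standby side: witness still sees the active server.
print(should_promote(StandbyView(5.0, witness_reachable=True, witness_sees_peer=True)))
```

The design choice worth noting is that a lost heartbeat alone is never enough. The standby needs a second, independent opinion before it takes over.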

To mitigate risks during testing and maintenance:

  • Conduct planned failover drills, and back up what the team observes with an automatically generated report.

  • Apply software updates in a phased manner, starting with the standby, then the active server.

  • Regularly run backup and restore scenarios.

On the security side, role-based access, multi-factor authentication, and network segmentation are basic requirements. On the licensing side, clarify the HA licensing model and the activation rules for the two nodes in advance.

To quickly recall SCADA and RTU concepts, this guide offers a useful summary: What is an RTU and how does it work with SCADA.


Field Security with Dual RTU: Reliable I/O and Control

RTU devices are the heart of field control. In a redundant RTU architecture, two devices can share the same I/O. One performs active control, and the other monitors and remains synchronized. If the active device fails, the second device takes over control without interruption.

How it works:

  • I/O sharing can be active-passive or active-active. Active-passive is preferred in most SCADA environments.

  • The primary RTU is selected as the leader during commissioning. The secondary RTU is synchronized in passive mode.

  • A fault is detected through a missing watchdog signal, loss of communication, or a power drop.

  • Time synchronization is maintained with NTP or GPS, so that event and trend data carry accurate timestamps.

  • Protocol support is important. Protocols like IEC 60870-5-104, DNP3, Modbus TCP, and MQTT are selected for both the connection to the center and inter-station communication.

  • Environmental rating: Prefer devices rated for the temperature, EMC, and vibration conditions of the site.
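
The active-passive takeover described in this list can be sketched as a small state machine on the secondary RTU. This is a simplified illustration, not the firmware logic of any particular device; the class `StandbyRtu`, its method names, and the timeout value are assumptions. Time is passed in explicitly so the behavior is easy to test:

```python
class StandbyRtu:
    """Passive RTU that takes over when the active unit shows a fault (illustrative)."""

    def __init__(self, watchdog_timeout_s: float, start_time_s: float = 0.0):
        self.watchdog_timeout_s = watchdog_timeout_s
        self.last_pulse_s = start_time_s
        self.role = "passive"

    def on_watchdog_pulse(self, now_s: float) -> None:
        """Record a watchdog pulse received from the active RTU."""
        self.last_pulse_s = now_s

    def poll(self, now_s: float, peer_link_up: bool, peer_power_ok: bool) -> str:
        """Re-evaluate the role and return it.

        Any one of the three fault signals (expired watchdog, lost
        communication, power drop) triggers the takeover.
        """
        watchdog_expired = (now_s - self.last_pulse_s) > self.watchdog_timeout_s
        fault = watchdog_expired or not peer_link_up or not peer_power_ok
        if self.role == "passive" and fault:
            self.role = "active"
        return self.role


rtu = StandbyRtu(watchdog_timeout_s=2.0)
rtu.on_watchdog_pulse(now_s=1.0)
print(rtu.poll(now_s=2.5, peer_link_up=True, peer_power_ok=True))  # still within timeout
print(rtu.poll(now_s=3.5, peer_link_up=True, peer_power_ok=True))  # watchdog expired
```

In this sketch the secondary never hands control back on its own. Returning to the original primary is left as a deliberate maintenance action, which avoids the two units flapping between roles. A real installation would also need an interlock so that a broken link between the two RTUs alone cannot leave both driving the same outputs.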

Good practices for power redundancy and field cabling:

  • Use dual power lines and an external UPS.

  • Segregate I/O cabling; route input and output groups through separate channels.

  • Adhere to line termination and shielding rules.

  • Define a safe shutdown scenario with watchdog relays.

For those who want to examine RTU examples that support redundancy, two different product families offer a good reference: DM100 RTU redundant SCADA solution and DM500 RTU with redundant CPU modules. This document provides a practical resource for detailed programming and protocol blocks: Mikrodev DCS programming guide.


Dual SIM and Multiple Connections: Seamless Data Communication

Dual SIM makes a big difference in sites relying on cellular infrastructure. Two operators, one goal: connection continues without interruption. The basic logic is to use the primary line as long as it is healthy, and automatically switch to the secondary upon detecting a problem.

Practical settings:

  • Switchover rules: Trigger the switch with signal level, packet loss, RTT threshold, and the number of consecutive errors.

  • Data quota: Monitor the monthly limit, and define the rule for activating the replacement line.

  • Health check: Test reachability of the actual endpoint with keepalives and periodic pings, not just the status of the modem link.
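
The switchover rules above can be sketched as a small selector that counts consecutive unhealthy samples before changing SIM. All names (`LinkSample`, `SwitchoverPolicy`, `DualSimSelector`) and all threshold values are assumptions for illustration; real values depend on the modem, the operator, and the site:

```python
from dataclasses import dataclass


@dataclass
class LinkSample:
    """One health measurement of the currently active cellular link."""
    rssi_dbm: float
    packet_loss_pct: float
    rtt_ms: float


@dataclass
class SwitchoverPolicy:
    """Illustrative thresholds; tune per site."""
    min_rssi_dbm: float = -105.0
    max_packet_loss_pct: float = 10.0
    max_rtt_ms: float = 800.0
    max_consecutive_failures: int = 3


class DualSimSelector:
    def __init__(self, policy: SwitchoverPolicy):
        self.policy = policy
        self.active_sim = "primary"
        self.consecutive_failures = 0

    def is_healthy(self, sample: LinkSample) -> bool:
        p = self.policy
        return (
            sample.rssi_dbm >= p.min_rssi_dbm
            and sample.packet_loss_pct <= p.max_packet_loss_pct
            and sample.rtt_ms <= p.max_rtt_ms
        )

    def observe(self, sample: LinkSample) -> str:
        """Feed one sample for the active link and return the SIM to use next."""
        if self.is_healthy(sample):
            self.consecutive_failures = 0
            return self.active_sim
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.policy.max_consecutive_failures:
            # Switch to the other SIM and start counting from zero on the new link.
            self.active_sim = "secondary" if self.active_sim == "primary" else "primary"
            self.consecutive_failures = 0
        return self.active_sim


selector = DualSimSelector(SwitchoverPolicy())
weak = LinkSample(rssi_dbm=-112.0, packet_loss_pct=2.0, rtt_ms=150.0)
for _ in range(3):
    active = selector.observe(weak)
print(active)
```

Requiring several consecutive failures is what keeps a single slow ping from causing a switch. The quota rule from the list could be added as one more condition inside `is_healthy`.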

Alternative path options:

  • Ethernet or fiber can be used as the primary path if feasible in the field.

  • Industrial radio links provide low-latency backup connections over short distances.

  • MPLS or SD-WAN solutions offer intelligent routing with central policies.

Security topics:

  • Private APN provides isolation in the cellular network.

  • VPN tunnel protects data with encryption and authentication.

  • Certificate management and device identity prevent unauthorized access.

Hot standby and failover concepts are not only for the server; they are also applied at the network layer.


Disaster Recovery Plan and Continuous Improvement

The disaster recovery plan is not a single document; it is a living process. But it can be managed with simple steps.

  • Determine goals: Define RTO and RPO values based on business impact. RPO can be seconds for critical alarms, and minutes for reporting.

  • Backup strategy: Use a combination of full, incremental, and continuous backups. Keep backups offline and geographically separated.

  • Switchover to the secondary center: Document the procedure step by step in the runbook. Include DNS, connection tunnels, SCADA license migration, operator access, and the rollback plan.

  • Drills: Supplement planned drills with surprise tests. Measure results, and record RTO and RPO deviations.

  • Observation and root cause analysis: Generate permanent corrective actions after an incident. Avoid repeating errors with configuration management and versioning.

  • Documentation and training: Prepare short, visual, and role-based guides for operator, maintenance, and network teams. Avoid knowledge loss when personnel changes.

  • Change management: Every patch, device replacement, or architectural update must pass through impact analysis. Approval and a rollback plan are mandatory.
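
Recording RTO and RPO deviations after each drill, as the list suggests, needs very little tooling. A minimal sketch, assuming the team measures the recovery time and the data-loss window in seconds; the names `DrillResult` and `deviations` and the sample numbers are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class DrillResult:
    """Measured outcome of one failover or recovery drill."""
    name: str
    recovery_time_s: float      # time until operation was restored
    data_loss_window_s: float   # span of data missing after recovery


def deviations(result: DrillResult, rto_target_s: float, rpo_target_s: float) -> dict:
    """Compare a drill result with the targets; positive deviation means target missed."""
    return {
        "drill": result.name,
        "rto_deviation_s": result.recovery_time_s - rto_target_s,
        "rpo_deviation_s": result.data_loss_window_s - rpo_target_s,
        "passed": (
            result.recovery_time_s <= rto_target_s
            and result.data_loss_window_s <= rpo_target_s
        ),
    }


# Hypothetical drill: targets are RTO = 5 minutes, RPO = 10 seconds.
report = deviations(
    DrillResult("planned server failover", recovery_time_s=240.0, data_loss_window_s=15.0),
    rto_target_s=300.0,
    rpo_target_s=10.0,
)
print(report)
```

Keeping these records per drill makes trends visible, for example an RPO deviation that grows after a configuration change.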

This cycle strengthens the redundancy culture. High availability is sustainable not just with equipment, but with process.


Conclusion

When dual SCADA servers, dual RTUs, and dual SIMs are used together, a backbone is established that maintains control and data from the field to the center. Hot standby, failover, high availability, and disaster recovery disciplines should be considered under one roof. Take action now: clarify your goals, rank risks, test with a small pilot, and then gradually expand. Plan a controlled and measurable journey, not a problem-free one. If you have a scenario you would like to share, leave it as a comment, and let’s clarify it together.
