All kinds of industries are adopting and benefiting from digital twins. In manufacturing, digital twins can serve as a simple monitoring solution to track production, or they can replicate the entire physical shop floor. In the transportation industry, digital twins can replicate trains, process their real-time GPS positions, and detect anomalies by monitoring several physical parts. In the healthcare industry, digital twins can replicate patients, model their conditions, and simulate the effect of certain medications. In building management, digital twins can optimize energy allocation and reduce electricity and gas usage.
Digital twins are a great way to try out new production capacities before launch, detect anomalies in every production phase, increase end product quality, and—with the latest artificial intelligence/machine learning (AI/ML) trends—combine advanced data science capabilities with real-time data to determine when maintenance should happen and prescribe actions.
But what organizations often underestimate is the exponential growth of generated data in volume, variety, and number of sources, driven by sources such as Internet of Things (IoT) devices and event streaming. According to Forbes, “From 2010 to 2020, the amount of data created, captured, copied, and consumed in the world increased from 1.2 trillion gigabytes to 59 trillion gigabytes, an almost 5,000 percent growth.”
But why is more data an accuracy risk factor for digital twins?
It’s all about data quality
Unfortunately, as data volumes grow, data quality tends to decrease. Gartner estimates that inaccurate, missing, or duplicated data costs organizations an average of $12.9 million annually. Apart from the immediate revenue impact, poor-quality data can increase the complexity of data management and undermine trust in business decisions.
Digital twins are becoming more pervasive in monitoring, suggesting actions, and simulating unknown future conditions. When a digital twin's outputs are influenced by inaccurate or missing data, the virtual copy itself can contribute to inaccurate decision-making:
- Manufacturing: A digital twin incorrectly identifies an anomaly that needs to be fixed. Maintenance is performed on parts that don’t need it, such as replacing a pump more often than necessary. Or worse, the digital twin suggests a configuration change that may result in overheating.
- Healthcare: A patient is incorrectly flagged as having stroke risk factors, based on a predictive model created with poor-quality data.
- Transportation: A shipment digital twin suggests a non-optimal route due to inconsistent data from multiple data sources.
Creating a better twin
How can you avoid inaccurate digital twins created with bad data? A central concern of modern data architectures is data quality. Many platforms can improve data quality, building trust in data and keeping inaccurate records out of downstream systems.
In a data architecture, increasing data quality is a five-step, iterative process:
- Integrate data from various systems through data virtualization, ETL, or real-time ingestion
- Profile data to discover and analyze where source data needs to be fixed or improved
- Proceed with manual data remediation to fix issues from previous steps
- Automate data cleansing and deduplication based on models and rules
- Monitor data over time and provide KPIs to understand data trends
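As a rough illustration of the profiling, cleansing, deduplication, and monitoring steps above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the sensor readings are made-up values, the field names (sensor_id, ts, temp_c) and the plausibility range are hypothetical, and a real data quality platform would do far more than this.

```python
from collections import Counter

# Hypothetical sensor readings feeding a digital twin (made-up values).
readings = [
    {"sensor_id": "pump-1", "ts": 1, "temp_c": 71.2},
    {"sensor_id": "pump-1", "ts": 1, "temp_c": 71.2},   # duplicate key
    {"sensor_id": "pump-1", "ts": 2, "temp_c": None},   # missing value
    {"sensor_id": "pump-2", "ts": 1, "temp_c": 940.0},  # implausible value
    {"sensor_id": "pump-2", "ts": 2, "temp_c": 68.9},
]


def profile(rows):
    """Profile: count rows, missing values, and duplicate (sensor_id, ts) keys."""
    keys = Counter((r["sensor_id"], r["ts"]) for r in rows)
    return {
        "rows": len(rows),
        "missing": sum(r["temp_c"] is None for r in rows),
        "duplicates": sum(n - 1 for n in keys.values()),
    }


def cleanse(rows, low=-40.0, high=150.0):
    """Cleanse and deduplicate: keep the first valid row per key.

    The low/high bounds are an assumed plausibility rule, not a standard.
    """
    seen = set()
    clean = []
    for r in rows:
        key = (r["sensor_id"], r["ts"])
        if key in seen:
            continue  # already kept a valid row for this key
        if r["temp_c"] is None or not (low <= r["temp_c"] <= high):
            continue  # missing or outside the assumed range
        seen.add(key)
        clean.append(r)
    return clean


def quality_kpi(raw, clean):
    """Monitor: share of raw rows that survive cleansing."""
    return len(clean) / len(raw) if raw else 1.0


report = profile(readings)
clean_rows = cleanse(readings)
kpi = quality_kpi(readings, clean_rows)
```

Tracking a figure like kpi over time is one simple way to see whether a source is getting better or worse. The manual remediation step is deliberately left out, since it depends on people reviewing the issues that profiling surfaces.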
By creating a so-called data quality firewall, you can provide real-time error detection and correction to protect your digital twins from inaccurate data. Data can be cleansed and corrected in real time to ensure that each digital twin receives only high-quality data, whatever the type of data source.
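To make the firewall idea more concrete, here is a small Python sketch of a filter that sits between incoming events and a digital twin. It is an illustration under stated assumptions, not a description of any particular product: the Event type, the correct and firewall functions, the train identifiers, and the validation rules are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class Event:
    """A cleaned GPS position for a train (hypothetical schema)."""
    train_id: str
    lat: float
    lon: float


def correct(raw: dict) -> Optional[Event]:
    """Return a cleaned Event, or None if the record cannot be repaired."""
    try:
        train_id = str(raw["train_id"]).strip().upper()  # normalize the ID
        lat = float(raw["lat"])                          # accept numeric strings
        lon = float(raw["lon"])
    except (KeyError, TypeError, ValueError):
        return None  # missing field or non-numeric coordinate
    if not train_id or not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None  # empty ID or coordinates outside valid ranges
    return Event(train_id, lat, lon)


def firewall(stream: Iterable[dict], rejected: List[dict]) -> Iterator[Event]:
    """Yield only corrected, valid events; collect the rest for review."""
    for raw in stream:
        event = correct(raw)
        if event is None:
            rejected.append(raw)
        else:
            yield event


incoming = [
    {"train_id": " ic-204 ", "lat": "48.8566", "lon": "2.3522"},  # repairable
    {"train_id": "IC-205", "lat": 123.0, "lon": 2.0},             # latitude out of range
    {"train_id": "IC-206", "lon": 2.0},                           # latitude missing
]
rejected: List[dict] = []
accepted = list(firewall(incoming, rejected))
```

Writing the firewall as a generator means each event is checked as it arrives rather than in a batch, and rejected records are kept aside for review instead of being silently dropped.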
With better, trusted data, digital twins are more accurate and reliable. Data quality is a must, especially when AI/ML models are trained and used for critical decision-making. Decisions may impact a patient’s life, a manufacturing process, or train maintenance, so there’s no room to neglect data quality.
Are you making bad digital twins?
Worried that your data is creating inaccurate digital twins? Spotfire is here to help. Contact us to learn more about how you can streamline digital twins.