A Digital Twin is a dynamic, virtual representation of its physical counterpart, usually across multiple stages of its lifecycle. It uses real-world data combined with engineering, simulation or machine learning models to enhance operations and support human decision making. A Digital Twin mirrors a unique physical object, process, organization, person or other abstraction, and can be used to answer what-if questions, present insights in an intuitive way and provide a way to interact with the physical object/twin.
The key elements in a Digital Twin:
While there are many different views of what a digital twin is and is not, the following characteristics are present in nearly every digital twin:
- Connectivity created by IoT sensors on the physical product to obtain data and integrate it through various technologies. Alternatively, in-field information about assets can be captured through drone photography or LIDAR scans, followed by 3D reconstruction using different techniques
- Digital Thread, a key enabler that interconnects all relevant systems and functional processes; through homogenization, it decouples the information from its physical form
- Re-programmable and smart, enabling a physical product to be reprogrammed manually or automatically
- Digital traces and modularity, to diagnose the source of a problem
A Digital Twin is not a single-technology play; rather, it is realized through an amalgamation of multiple technologies such as:
- Visualization, AR/VR: Usually the topmost layer in a digital twin, which combines the data and insights to present, advise and interact with the user or other machines
- Workflow and APIs: Extract and share data from multiple sources to create the digital twin and/or infuse the insights into the workflows of the digital twin
- Artificial Intelligence/analytics: Using machine learning frameworks and analytics to make real-time decisions based on historical and streaming data
- Knowledge graph: Creates a digital thread based on a semantic model, data dictionary and knowledge graph
- Internet of Things and Data Platform: Real-time ingestion of data on state, conditions and events, gathered via sensors and gateways from physical assets/objects, together with the ability to integrate, persist, transform and govern the collected data
- Digital infrastructure: Hybrid infrastructure including cloud, edge compute, in-plant infrastructure, etc.
- Physical infrastructure: Instrumentation of physical objects via sensors, gateways, network (IT/OT), etc.
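To make the knowledge graph item above more concrete, here is a minimal sketch of a digital thread held as subject-predicate-object triples in memory. The class name `DigitalThread`, the asset identifiers and the predicate names are all assumptions made for illustration; a production knowledge graph would use a dedicated graph store and a governed semantic model.

```python
from collections import defaultdict


class DigitalThread:
    """Tiny in-memory triple store: (subject, predicate, object). Illustrative only."""

    def __init__(self):
        self._out = defaultdict(list)  # subject -> [(predicate, object), ...]

    def link(self, subject, predicate, obj):
        """Record one relationship between two lifecycle artefacts."""
        self._out[subject].append((predicate, obj))

    def related(self, start):
        """Return every node reachable from `start`, in breadth-first order."""
        seen, queue, order = {start}, [start], []
        while queue:
            node = queue.pop(0)
            for _, target in self._out.get(node, []):
                if target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order


# Hypothetical lifecycle data for a single pump.
thread = DigitalThread()
thread.link("pump-17", "designed_in", "cad-model-A3")
thread.link("pump-17", "monitored_by", "sensor-vib-02")
thread.link("sensor-vib-02", "writes_to", "telemetry-stream")
thread.link("pump-17", "maintained_under", "work-order-881")

print(thread.related("pump-17"))
```

Walking the links from one asset returns its design model, its sensor, its work order and the stream the sensor writes to, which is the kind of cross-system context a digital thread is meant to provide.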
Emerging architecture of Digital Twin, an IBM point of view:
The architecture for a digital twin essentially comprises all the elements described in the section above. The objective is to support the functionality to connect, monitor, predict and simulate multiple physical objects, assets and/or processes.
Figure 1: Digital Twin Logical Architecture
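The four functions named above (connect, monitor, predict and simulate) can be sketched as methods on a single twin object. This is a toy illustration rather than the architecture itself: the `PumpTwin` class, its temperature limit, the naive trend forecast and the linear load model are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class PumpTwin:
    """Hypothetical twin of one pump; names and thresholds are illustrative."""
    asset_id: str
    max_temp_c: float = 80.0
    readings: list = field(default_factory=list)

    def connect(self, temp_c: float) -> None:
        """Connect: accept one temperature reading from the physical asset."""
        self.readings.append(temp_c)

    def monitor(self) -> bool:
        """Monitor: True when the latest reading exceeds the limit."""
        return bool(self.readings) and self.readings[-1] > self.max_temp_c

    def predict(self) -> float:
        """Predict: naive forecast = last reading + average step between readings."""
        if len(self.readings) < 2:
            raise ValueError("need at least two readings to predict")
        steps = [b - a for a, b in zip(self.readings, self.readings[1:])]
        return self.readings[-1] + mean(steps)

    def simulate(self, load_factor: float) -> float:
        """Simulate: what-if temperature if load scales by `load_factor`.

        Assumes temperature scales linearly with load, which is a toy model.
        """
        if not self.readings:
            raise ValueError("need at least one reading to simulate")
        return self.readings[-1] * load_factor


twin = PumpTwin("pump-17")
for value in (70.0, 72.0, 74.0):
    twin.connect(value)

print(twin.monitor())       # is the asset over its limit right now?
print(twin.predict())       # naive forecast of the next reading
print(twin.simulate(1.25))  # what-if: 25% more load
```

In a real deployment each method would be backed by a different layer: ingestion for `connect`, dashboards and alerts for `monitor`, trained models for `predict` and engineering simulations for `simulate`.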
Building Digital Twin with AWS technology stack:
Proposed here is a digital twin that is primarily based on the AWS software stack:
Figure 2: Digital Twin Technology Architecture with AWS
Architecture Layer | Technology | Vendor | Purpose |
Data Platform: Ingestion | Kinesis Data Streams, Kinesis Data Firehose, IoT Core | AWS | Ingesting and processing device telemetry securely is one of the key requirements in digital twins. A combination of IoT Core and Kinesis allows connecting to heterogeneous data sources over multiple protocols and supports streaming ingestion |
Data Platform: Persistence | DynamoDB, RDS/Aurora, S3, Timestream, Redshift | AWS | Considering the diverse nature of the data that a Digital Twin handles, polyglot storage is recommended: a combination of relational, non-relational, time-series, data lake and warehouse stores |
Data Platform: Integration, Transformation & Quality | Glue, Lambda | AWS | Provides data consistency through validation, enrichment, cataloguing and transformation that conform to the standard, and integrates with the other systems |
Digital Thread: Knowledge Graph | KITT | IBM | The Knowledge Graph represents a collection of interlinked descriptions of entities: objects, events or concepts. It puts data in context via linking and semantic metadata. The IBM asset KITT is a general-purpose knowledge graph that enables the Digital Thread required to link lifecycle information and data together. KITT is proposed to be deployed on Red Hat OpenShift |
Modelling & Execution: Analytics, Machine Learning | SageMaker, Athena, Kinesis Data Analytics, EMR | AWS | Cloud-based machine learning platform for building, training and deploying models based on the data collected via the data platform. Also provides the ability to perform analytics on streaming data. Athena is proposed as the interactive query service that makes it easy to analyse data from, e.g., S3 using standard SQL. To process and analyse large volumes of data, Amazon EMR is also proposed |
Consumption & Visualization: Dashboard, Apps | QuickSight, Fargate | AWS | For portals, real-time dashboards and command centres, QuickSight is proposed as the BI service and Fargate to run containerized web apps |
Consumption & Visualization: AR/VR/3D/2D | Sumerian | AWS | AR and VR services are required to visualize diagnostics, predictions and recommendations for the physical world. An extension of dashboard visualization is also required in a digital twin to provide a view in 3D/2D |
Consumption & Visualization: API & Microservices | API Gateway, Node.js, Spring Boot | AWS | To provide secured access to application APIs and batch files, we propose microservice-based applications built using Node.js/Spring Boot and exposed via Amazon API Gateway |
Workflow Mgmt.: Intelligent Workflow | Lambda, AppSync, Simple Workflow Service, Airflow, Red Hat Process Automation | AWS, Apache, Red Hat | Workflow management in a digital twin is intended to deal with business processes, simulation and event-based flows. Besides the AWS stack, we also propose Red Hat Process Automation and Apache Airflow, either of which could be hosted on Red Hat OpenShift (RHOS) on AWS |
Governance & Operations: DataOps | Glue | AWS | DataOps is an essential element of a Digital Twin: it sits atop the big data, which must be available on time, automated and well managed to extract value. AWS Glue, with its end-to-end capability to discover, prepare and combine data for analytics and machine learning, is the right fit for the purpose |
Governance & Operations: DevOps | CloudFormation, CodeBuild, CodePipeline, CodeDeploy, CodeStar | AWS | AWS CodePipeline helps build a continuous integration or continuous delivery workflow that uses AWS CodeBuild, AWS CodeDeploy and other tools; each service can also be used separately. With AWS CodeStar, the entire continuous delivery chain can be set up for a scalable DevOps solution |
Governance & Operations: MLOps | SageMaker | AWS | Amazon SageMaker, as the MLOps platform, fulfils the AI@Scale goals by providing the capability to build, train, deploy and maintain machine learning models in production reliably and efficiently |
Governance & Operations: Data Governance | Collibra | Collibra on AWS | The Collibra Data Governance and Catalog solutions help find, understand and trust the data, ensuring quality and accessibility in a digital twin |
Hybrid Infrastructure: Edge & Cloud | Greengrass, Elastic Kubernetes Service, Elastic Compute Cloud, Red Hat OpenShift on AWS | AWS, Red Hat | Hybrid infrastructure that extends beyond the cloud is a reality in digital twins. The core components will be on the cloud, with the edge as a key component residing outside it. EKS and/or ROSA (Red Hat OpenShift Service on AWS) based containers are the ideal choice for the non-serverless components. Those requiring VM-style infrastructure can be served via EC2 instances |
Security and Monitoring | ECR, IAM, Secrets Manager, CloudWatch | AWS | A Digital Twin has multiple aspects of security that need to be catered for, including identity management, information protection, managing infrastructure secrets, monitoring, etc. |
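As an illustration of the polyglot storage recommended in the Persistence row, the sketch below routes records to one of the store categories listed there. It makes no AWS calls: the `kind` field, the record shapes and the routing rules are hypothetical, and a real design would derive them from access patterns and data volumes.

```python
from enum import Enum


class Store(Enum):
    """Store categories named in the Persistence row of the table."""
    TIME_SERIES = "Timestream"
    KEY_VALUE = "DynamoDB"
    RELATIONAL = "RDS/Aurora"
    DATA_LAKE = "S3"
    WAREHOUSE = "Redshift"


def choose_store(record: dict) -> Store:
    """Pick a store from a hypothetical `kind` field on the record."""
    kind = record.get("kind")
    if kind == "telemetry":      # timestamped sensor measurements
        return Store.TIME_SERIES
    if kind == "asset_state":    # latest state looked up by asset id
        return Store.KEY_VALUE
    if kind == "work_order":     # structured, relational business data
        return Store.RELATIONAL
    if kind == "aggregate":      # curated data for reporting
        return Store.WAREHOUSE
    return Store.DATA_LAKE       # raw or unknown payloads land in the lake


# Hypothetical records of three different shapes.
records = [
    {"kind": "telemetry", "asset": "pump-17", "temp_c": 74.0},
    {"kind": "work_order", "id": 881},
    {"kind": "lidar_scan", "uri": "scan-0001.las"},
]
for record in records:
    print(record["kind"], "->", choose_store(record).value)
```

Defaulting unknown payloads to the data lake keeps raw data available for later processing without forcing a schema decision at ingestion time.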
Conclusion:
A Digital Twin is intended to connect the digital and physical worlds, taking IoT, machine learning and virtual reality in tandem to the next level. Read more about digital twins here: https://www.ibm.com/topics/what-is-a-digital-twin. Creating an end-to-end digital twin platform requires a lot more than a single set of capabilities: it involves a heterogeneous software and hardware stack and multiple architectures, with data as the principal theme. Read more about data architecture here: https://www.ibm.com/in-en/analytics/data-fabric. There are specialized software vendors for each layer of the architecture; the flexibility comes from adopting a hybrid approach in which the hyperscaler forms the core part of the solution. While the individual customer environment will determine the choice of hyperscaler, the advantage of AWS, with its IoT-centric platform and strong data products, can be leveraged to build powerful digital twin solutions. Refer to the AWS Architecture Center for its diverse architecture knowledge base: https://aws.amazon.com/architecture/.