System Integration
System integration is a deep area; we begin by differentiating data integration from application integration. Data integration is an essential need for all organizations so that data in their disparate data sources does not remain siloed. There are three main forms of data integration used by organizations: replication (copying the data as-is), consolidation (transforming source data into a condensed format for shared access), and virtualization (keeping the data in place but allowing access from other places).
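The three forms can be contrasted with a minimal Python sketch. The sensor tags, values, and helper names below are hypothetical and exist only for illustration; real deployments would use databases or historians rather than in-memory lists.

```python
from statistics import mean

# Hypothetical source system: raw sensor readings (illustrative data only).
source = [
    {"tag": "pump1.temp", "value": 71.0},
    {"tag": "pump1.temp", "value": 73.0},
    {"tag": "pump2.temp", "value": 65.0},
]


def replicate(rows):
    """Replication: copy the rows as-is into another store."""
    return [dict(row) for row in rows]


def consolidate(rows):
    """Consolidation: transform the rows into a condensed form (mean per tag)."""
    by_tag = {}
    for row in rows:
        by_tag.setdefault(row["tag"], []).append(row["value"])
    return {tag: mean(values) for tag, values in by_tag.items()}


class VirtualView:
    """Virtualization: leave the data in place and read it on demand."""

    def __init__(self, rows):
        self._rows = rows  # a reference to the source, not a copy

    def latest(self, tag):
        values = [r["value"] for r in self._rows if r["tag"] == tag]
        return values[-1] if values else None


replica = replicate(source)
summary = consolidate(source)
view = VirtualView(source)

# A new reading arrives at the source after the copies were taken.
source.append({"tag": "pump2.temp", "value": 67.0})
```

In this sketch the replica and the summary are point-in-time copies and do not see the new reading until they are refreshed, whereas the virtual view reads the source directly and does.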
Application integration is required when two or more applications must share information to enable or facilitate a certain business process. Both forms of integration may use similar techniques and software, but the intent of application integration is different from that of data integration. When we think of transactional data processing, we are thinking of application integration, whereas when we think of moving large amounts of data for historical data analysis, we are talking about data integration.
Traditionally, the functionality of an application has been self-contained. However, as the scale of data grows (due to industry trends such as big data and IoT) and as companies try to take advantage of real-time data analytics, the singleton application design is evolving into distributed data integration patterns. We separately discuss this subject below.
From an operational data standpoint, organizations require both data integration and application integration. Often, for data security and performance reasons, companies prefer to migrate data to another data store rather than allow direct access to the historian (preferring data consolidation or replication to virtualization), although this is not always the case. As more sensor data is collected, companies are also increasingly looking at big data frameworks, especially for the ability to scale streaming data and to leverage cloud-based analytics.
Data Integration Tools
Data integration tools aim to either consolidate or replicate data. They may connect to a number of source systems, perform a number of transformations on the data, and load the final data to several destinations. Below are some of the commonly used tools and software for data integration.
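As a rough illustration of the connect, transform, and load steps described above, here is a minimal Python sketch using only the standard library. The CSV content, table name, and function names are assumptions made for this example; the commercial tools listed below provide this kind of pipeline through their own designers and connectors.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file export from a source system (illustrative data only).
RAW_CSV = """tag,timestamp,value
pump1.temp,2024-01-01T00:00:00,71.0
pump1.temp,2024-01-01T00:01:00,
pump2.temp,2024-01-01T00:00:00,65.5
"""


def extract(text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing value and convert values to floats."""
    cleaned = []
    for row in rows:
        if not row["value"]:
            continue
        cleaned.append((row["tag"], row["timestamp"], float(row["value"])))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows to the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (tag TEXT, ts TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()


# An in-memory SQLite database stands in for the destination data store.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute("SELECT tag, value FROM readings ORDER BY tag").fetchall()
```

Keeping the three steps as separate functions mirrors how ETL tools let each stage be configured, tested, and swapped independently.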
Informatica PowerCenter
Informatica PowerCenter is a popular ETL (extract, transform, and load) tool that is used to connect to numerous data sources such as relational databases, flat files, ERP systems, and web services.
SQL Server Integration Services
SQL Server Integration Services (SSIS), like Informatica's PowerCenter, is an enterprise-level ETL platform that facilitates data consolidation design patterns. It is often used to feed data warehouses in order to enable data analytics and business intelligence.
Qlik Replicate
Qlik Replicate is a tool that replicates data using a continuous change detection mechanism. It can capture transactional, batch, or streaming data and can output the results to a large catalog of destination endpoints.
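The general idea of change-based replication can be sketched in a few lines of Python. This is a simplified snapshot comparison written for illustration, not a description of how Qlik Replicate is implemented; production replication tools often read database transaction logs instead. All names and data below are hypothetical.

```python
def detect_changes(previous, current):
    """Compare two snapshots keyed by primary key and list the changes."""
    changes = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    for key in previous:
        if key not in current:
            changes.append(("delete", key, None))
    return changes


def apply_changes(target, changes):
    """Apply detected changes to a replica so it matches the source."""
    for op, key, row in changes:
        if op == "delete":
            target.pop(key, None)
        else:
            target[key] = row


# Hypothetical source table at two points in time.
before = {1: {"status": "open"}, 2: {"status": "open"}}
after = {1: {"status": "closed"}, 3: {"status": "open"}}

target = dict(before)  # the replica starts in sync with the old snapshot
changes = detect_changes(before, after)
apply_changes(target, changes)
```

Only the changed rows are shipped to the replica, which is what makes this style of replication cheaper than repeatedly copying the full data set.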
Azure Data Factory
Azure Data Factory (ADF) is the cloud-based ETL and data integration service from Microsoft that allows you to create data-driven workflows for orchestrating data movement and transforming data at scale. It is a serverless cloud-native solution and can integrate with a large number of cloud-based or on-premises endpoints.
AWS Glue
Like ADF, AWS Glue is a fully managed ETL service, in this case from Amazon. AWS Glue also allows connectivity to numerous data sources and has an ETL engine that generates Python or Scala code in the background.
Upsolver
Upsolver offers a no-code data lake engineering platform designed for cloud analytics. Upsolver has connectors to a number of cloud and on-premises endpoints. Upsolver is deployed in the AWS cloud on EC2 instances.
Application Integration Tools
These tools and platforms help organizations with application-to-application integration. This is achieved through various flavours of tools, including enterprise service buses (ESBs), application programming interface (API) gateways, and distributed messaging frameworks. Note that this is not an exhaustive list, and there are many other popular tools in the marketplace.
Oracle Service Bus
Oracle Enterprise Service Bus (Oracle ESB) is a key component of Oracle's SOA suite of products. It provides connectivity, transformation, management, and monitoring functions. Oracle ESB connects to a number of on-premises data sources such as file systems, ERPs, email servers, web services, and several others.
Azure Logic Apps
Azure Logic Apps is a cloud-based platform for creating and running automated workflows that integrate applications, data, services, and systems. The platform allows serverless workflow development and leverages Azure's integration services platform to connect applications together to automate business processes.
WebMethods
Software AG's webMethods is an integration and API management platform. It runs in the cloud, on premises, or in hybrid or multi-cloud environments. It contains an ESB, an API gateway, and a business-to-business (B2B) document management system.
MuleSoft Anypoint
MuleSoft's Anypoint Platform is a solution for iPaaS (integration platform as a service) and API management. Owned by Salesforce, it also contains a number of management and security features that support information governance.
TIBCO Cloud Integration
TIBCO's Cloud Integration solution is a no-code iPaaS platform that provides self-serve integration capabilities with cloud-native integration flows. It can connect to numerous on-premises and cloud endpoints.
AWS Step Functions
AWS Step Functions is a low-code, visual workflow service that allows developers to build distributed applications. Similar to Azure Data Factory, it helps enable business process automation through data and machine learning pipelines using AWS services.
Big Data Frameworks
One category of tools we have not yet discussed is big data frameworks. The aforementioned tools in the application and data integration sections can handle real-time data updates, but the following tools have been architected especially for massive-scale distributed processing. For this reason, one component does not necessarily do all the heavy lifting in an integration solution, as the responsibility is split by function.
These frameworks can be suitable for both application and data integration, but the tools in this section are meant to handle streaming data. For this reason, examples in this section are product stacks as opposed to individual products. Examples of data integration frameworks include Azure Databricks, Azure HDInsight, Apache Hadoop, Apache Spark, and Amazon Elastic MapReduce. Examples of application integration solutions include Azure Service Bus, Amazon MQ, Apache Kafka, and HiveMQ.
How Can We Help?
Here are some ways in which we can help. This is not an exhaustive list - contact us today to discuss your needs to see how we can help!
Architecture Design
We will help design your integration architecture and help you achieve your data and application integration goals. The designs will address security, scalability, and performance, and will follow general integration best practices.
Implementation
We will help you build your integration solutions. Our team has a lot of experience with ETL / ELT solutions, as well as the application integration solutions noted above. We also have strong experience in connecting to OT data sources like historian systems and control systems.
Advisory
With the dizzying array of new tools, techniques, and platforms with which to integrate data, clients are often left wondering where to start. We can help guide you and develop an integration roadmap to achieve your short- and long-term strategic goals.
Custom Development
Where there are gaps in existing solutions, our team can write custom code or drivers to help with connectivity or processing-related functions.
Managed Support
We provide managed support and staff augmentation services for your integration solution. We provide flexible support packages including business hours support, 24x7 support, or weekend support. We have a large pool of skilled resources who can help support your systems.
Optimization
We can help optimize your integration solutions by helping you identify areas where you can reduce maintenance, improve performance, and improve scalability. We will help you streamline your data ingestion and transformation pipelines and can even help recommend or review newer tools in the marketplace that may service your needs better than your existing toolset(s).