In recent years, Artificial Intelligence (AI) has increasingly changed how businesses operate across all sectors, especially Finance and Insurance, Professional Services, and IT, according to CB Insights.
Identifying AI initiatives within a business is a joint effort between the line of business, engineering, and data science teams. For a company to benefit from AI, it must identify its business objectives and any financial constraints. Without a solid foundation, a clear direction, and a strong chain of communication between the business and technical teams, yielding positive returns will be challenging.
The benefits of AI can be seen in the success of companies that pioneered AI, such as Google, Facebook, and Amazon. Now, as the industry continues to evolve with new technologies and solutions, AI has become more accessible. This has created opportunities for companies of all sizes to leverage data to drive business decisions. However, the enterprise adoption of AI is still in its early stages, and many companies face high failure rates when it comes to Machine Learning (ML). According to ESI ThoughtLab, only 20% of AI projects are in widespread deployment. Additionally, AI is not a “one-size-fits-all” solution, and 40% of AI projects create negative or no returns.
To uncover why this is happening, it is crucial to have a concrete understanding of the AI ecosystem: components and key players.
We understand that AI is a constantly and quickly evolving field, and that its ecosystem evolves with it. The purpose of this article is to give an overview of what the AI ecosystem looks like in 2020, point out gaps and challenges in the enablement of AI, and determine the main factors behind the slow adoption of ML.
In order to discuss the AI ecosystem, it is important to first understand the path to operationalize AI and the roles of the different components along that path.
Although we will discuss the path as distinct components, there is some overlap between them. Many companies offer services or technology that fall under multiple levels, and some cover every category (Microsoft, Amazon, Google, etc.). Additionally, some companies offer end-to-end AI solutions or “AI Consulting”.
Companies that are considered “AI Consultants” act as a combination of the components in the AI ecosystem, offering personalized AI solutions, from raw data to business impacts. They also tend to have a few staff that can support companies in identifying AI use cases. They deliver enterprise AI in a highly scalable, powerful, and collaborative environment and can serve multiple clients with relatively small teams focusing on the business side of AI.
Moving on to the individual components along the path to AI operationalization, the following information provides a 4-step landscape of companies providing services or platforms at different levels of AI maturity.
1. Data Collection and Storage
The first part of this journey involves data collection and storage. The quality of data is an important factor in the AI process and in the success of machine learning projects, which is why data collection and storage are the foundation of AI enablement. Data can be collected from different sources through companies like Lionbridge or with sensors through IoT companies. Storage is an integral part of the AI ecosystem because it enables AI on a larger scale by delivering computation, memory, and communication bandwidth.
2. Extract, Transform, Load (ETL)
The next segment of the roadmap deals with data extraction, transformation, and loading. Specifically, data is extracted from data sources that are not optimized for analytics, transformed into a consistent format, and loaded into a central host. Its use cases include cloud migration, data warehousing, data integration, and machine learning/AI. This component is important to the enablement of AI because it helps maintain the integrity of the data used for modeling and business intelligence.
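To make the three ETL stages concrete, here is a minimal sketch in Python using only the standard library. The sample data, the column names, and the `customers` table are hypothetical and chosen purely for illustration; a production pipeline would read from real source systems and load into a data warehouse rather than an in-memory SQLite database.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system not optimized for analytics.
RAW_CSV = """customer_id,signup_date,monthly_spend
 101 ,2020-01-15,49.90
102,2020-02-03,
103,2020-02-20,120.00
"""


def extract(raw_text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: trim whitespace, cast types, drop rows with missing spend."""
    cleaned = []
    for row in rows:
        spend = row["monthly_spend"].strip()
        if not spend:
            continue  # skip incomplete records to protect data integrity
        cleaned.append(
            (int(row["customer_id"].strip()), row["signup_date"].strip(), float(spend))
        )
    return cleaned


def load(records, connection):
    """Load: write the cleaned records to a central table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(customer_id INTEGER PRIMARY KEY, signup_date TEXT, monthly_spend REAL)"
    )
    connection.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    connection.commit()


def run_etl(raw_text, connection):
    """Run all three stages and return the number of rows now in the table."""
    load(transform(extract(raw_text)), connection)
    return connection.execute("SELECT COUNT(*) FROM customers").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(run_etl(RAW_CSV, conn))
```

Keeping the stages as separate functions means the cleaning rules can be tested and changed without touching the source or the destination.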
3. Cloud Dataflow
Following ETL, the data is ready for the models. The next step is to effectively move the data to the cloud. Key capabilities of companies in this segment include moving data to the cloud and connecting complex systems. Additionally, you can stream data such as video, audio, application logs, website clickstreams, and IoT telemetry and analyze it while it’s still in motion, before it is stored, so you can take immediate action on what’s relevant and ignore what isn’t.
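The idea of acting on data while it is still in motion can be sketched with a Python generator pipeline. Everything here is an assumption for illustration: the event fields, the `event_stream` stand-in for a live feed, and the temperature threshold. A real deployment would consume events from a streaming platform rather than from a hard-coded generator.

```python
from typing import Dict, Iterable, Iterator, List


def event_stream() -> Iterator[Dict]:
    """Hypothetical stand-in for a live feed of mixed events."""
    yield {"source": "thermostat", "type": "telemetry", "value": 21.5}
    yield {"source": "web", "type": "click", "value": 1}
    yield {"source": "thermostat", "type": "telemetry", "value": 48.0}
    yield {"source": "app", "type": "log", "value": 0}


def relevant(events: Iterable[Dict], threshold: float = 40.0) -> Iterator[Dict]:
    """Pass through only telemetry readings above the threshold.

    Events are examined one at a time as they arrive; nothing is stored first.
    """
    for event in events:
        if event["type"] == "telemetry" and event["value"] > threshold:
            yield event


def process(events: Iterable[Dict]) -> List[str]:
    """Take immediate action (here, build an alert) on each relevant event."""
    return [f"ALERT {e['source']}: {e['value']}" for e in relevant(events)]


if __name__ == "__main__":
    for alert in process(event_stream()):
        print(alert)
```

Because generators are lazy, each event is filtered as soon as it is produced, which mirrors the "analyze before storing" pattern described above.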
4. AI Processing
At this final point on the path to AI operationalization, companies provide a few services and tools. Hyperscalers host the architecture that lets companies scale appropriately as demand on the system increases. Cloud platforms provide a framework designed to give organizations faster, more efficient, and more effective collaboration between data scientists and other staff. They can also help minimize costs in a number of ways: preventing duplication of effort, automating simple tasks, and eliminating some expensive activities, such as copying or extracting data. Cloud platforms can also provide data governance, ensuring the use of best practices by a team of AI scientists and ML engineers, and they can help ensure that work is distributed more evenly and completed more quickly. Additionally, cloud technologies offer cloud-compatible services for analysis, such as natural language processing (NLP), computer vision, and machine learning (ML).
Now that we have an understanding of the components of AI operationalization, let’s take a look at the AI Ecosystem. Below we have identified who the key players are and we take a look at how these companies are contributing to the journey to AI.
Accenture offers an “Applied Intelligence” service, leveraging AI to maximize existing investments and bring new technologies to increase business efficiency and productivity. They create end-to-end personalized solutions for clients including but not limited to: designing, building and deploying AI models, data analytics, data automation, and enterprise data management.
Problem: Sun Chemical needed a solution to allow them to better manage spending and transition purchasing functions to the cloud. They wanted to align with industry best practices as well as deploy new functionalities to facilitate data transfer between enterprise resource planning (ERP) systems and Sun Chemical’s SAP Ariba Buying and Invoice System. Sun Chemical also wanted to find a way to avoid the re-implementation of new integration toolkits by moving all the components of integration to the cloud.
Outcome: Accenture had the solution. They helped Sun Chemical move its purchasing functions to the cloud, allowing the company to optimize spending and gain access to new functionalities like guided buying and spot buying with established sellers and marketplaces. Additionally, Accenture created a custom quarterly report for Sun Chemical’s buying and invoice system. Updates of their systems allowed Sun Chemical to stay current with their technologies with a more user-friendly interface powered by buy recommendation pipelines.
“We make change work for you and your business” — Accenture
1. Data Collection and Storage
Sensory is a company that develops technology for face and speech recognition, NLP, and computer vision, among others. For example, they enable manufacturers to build fitness accessories and wireless headphones with intelligent voice user interfaces.
Problem: Although voice control has become widespread in smartphone and computer virtual assistants, smaller appliances like microwaves lack this capability. Consumers do not have access to such devices with accurate, low-latency speech recognition, especially devices that work without an internet connection to maintain privacy. Additionally, most comparable devices lack flexible natural language understanding to account for differences in phrasing.
Outcome: Sensory powers customers like Spotify, LG, and Logitech through their Cloud-free, “always listening AI”. The technology provides voice-based interactions with products at home like microwaves, refrigerators, and thermostats as well as smart devices on the go. Examples include voice-controlled thermostats and real-time alerts triggered by home security alarms, knocking, babies crying, and more, all with entirely customizable sound profiles for specific use cases. Sensory offers consumers a secure, convenient, and flexible way to interact with small devices at home or on the go.
“AI on the edge!” — Sensory
2. Extract, Transform, Load (ETL)
Paxata is a self-service data preparation platform that allows businesses to combine, clean, and shape data to prepare for scalable AI. They support data ingestion from a variety of sources, including enterprise NoSQL and relational databases, cloud applications, and web-based applications like Salesforce. Paxata also offers integration with analytical tools and BI dashboards as well as end-to-end data lineage and traceability.
Problem: Advancements in medical science have created exponential growth in data: genomic, clinical, patient, and others. Many studies require teams of researchers and data from multiple sources. This integration is traditionally done with Excel or Perl and is very time-consuming. Additionally, many bioinformaticians and researchers don’t possess the programming expertise to carry out these tasks in a time-effective way, leading to months of cycle time for genome clinical studies.
Outcome: Precision Profile is a bioinformatics company focused on the analysis of genomic profiles to develop treatment plans. Paxata offers the infrastructure for programming and ETL that the majority of researchers in the industry lack. Their solution enabled Precision Profile researchers to accelerate the research cycle, directly compare large cohorts of patients, and incorporate third-party references to optimize treatment plans. By empowering oncologists to leverage their data and make it useful, Paxata has been proven to reduce genome clinical studies from 1–3 months to 2–8 hours.
“Smarter ML Models through faster, more accurate data prep.” — Paxata
3. Cloud Dataflow
Pandio helps companies leverage their valuable data and make it usable for AI models. Customers are able to break down data silos, connect complex systems, build a data lake, migrate data warehouses, and ultimately, enable machine learning. Overall, Pandio enables data-driven decision making across the enterprise and accelerates the adoption of AI.
Problem: Following three major acquisitions, a billion-dollar media company needed to unify customer data and product fulfillment in order to provide a comprehensive user experience and optimized ad placement and results across hundreds of media options, dozens of internal systems, and large outside partner and vendor systems. Their envisioned machine-driven solution was stuck in neutral, with massive spending and little movement.
Outcome: With Pandio, the customer worked collaboratively to outline their needs and focus on desired outcomes. Pandio’s technology enabled the full integration of legacy and siloed systems as well as multiple vendors and partners. Instead of needing an operating team of over 100, the customer relies on a high-value team of ~25 employees. The company is now fully operationalizing ML, from customer engagement and estimates to product fulfillment, measurements, and results reporting. With this product and the subsequent growth, the customer was able to go through a recent IPO.
“Enabling a data-driven future.” — Pandio
4. AI Processing
H2O.ai is an all-encompassing open-source data science and machine learning platform. They offer products and services like “Driverless AI” for automatic ML, personalized AI apps, enterprise support, as well as deployment, management, and governance of models in production.
Problem: Every company loses customers at some point, but could some of this loss have been prevented? Many companies do not have the data or infrastructure to see that things have gone bad until it’s too late. To get ahead of customer churn, many cloud platforms offer services that model customer behavior, empowering businesses to maximize customer retention and profit.
Outcome: H2O.ai helped the PayPal data science team scale up their existing churn models. This allowed PayPal to take quicker action and re-engage customers to prevent churn. Specifically, H2O.ai allowed for multiple runs of the same model, with an evaluation of the difference in churn probability between runs, and faster processing overall. Additionally, H2O.ai produced assessments of which features were the most impactful for the model.
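As a rough illustration of what comparing churn probabilities across model runs and ranking feature impact can look like, here is a small Python sketch. It does not use H2O.ai’s API and is not PayPal’s model: the logistic scoring function, the two sets of weights, and the feature names are all hypothetical and hand-picked for the example.

```python
import math

# Hypothetical weights from two runs of the same churn model (illustrative only).
RUN_A = {"bias": -1.0, "days_since_last_login": 0.05, "support_tickets": 0.30}
RUN_B = {"bias": -1.2, "days_since_last_login": 0.08, "support_tickets": 0.10}


def churn_probability(weights, customer):
    """Logistic score: sigmoid of the weighted sum of the customer's features."""
    z = weights["bias"] + sum(
        w * customer[name] for name, w in weights.items() if name != "bias"
    )
    return 1.0 / (1.0 + math.exp(-z))


def compare_runs(customer):
    """Return both churn probabilities and their difference (run B minus run A)."""
    p_a = churn_probability(RUN_A, customer)
    p_b = churn_probability(RUN_B, customer)
    return p_a, p_b, p_b - p_a


def most_impactful_feature(weights, customer):
    """Feature with the largest absolute contribution to this customer's score."""
    contributions = {
        name: abs(w * customer[name]) for name, w in weights.items() if name != "bias"
    }
    return max(contributions, key=contributions.get)


if __name__ == "__main__":
    customer = {"days_since_last_login": 30, "support_tickets": 2}
    p_a, p_b, diff = compare_runs(customer)
    print(f"run A: {p_a:.3f}, run B: {p_b:.3f}, difference: {diff:+.3f}")
    print("most impactful feature (run A):", most_impactful_feature(RUN_A, customer))
```

A large difference between runs for the same customer is a signal that the model is sensitive to how it was trained, which is worth investigating before acting on the score.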
“AI Democratization — AI for everyone” — H2O.ai
This thought piece has endeavored to provide an understanding of the components of AI operationalization and an overview of the AI Ecosystem.
It is also useful to identify the most pressing challenge that many enterprises face on their journey towards AI adoption. As highlighted previously, despite the enormous attention and investment in AI, most initiatives fail. Although the ecosystem is constantly maturing, key areas still lag behind customer requirements and create significant obstacles to executive buy-in and execution.
From our perspective, the biggest gap falls within the Cloud Dataflow component. The majority of vendors in this space have designed architectures for dealing with Big Data analytics. However, ML is far more demanding: it involves growing amounts of structured and unstructured data, the management and processing of larger files, and a need for affordable compute and storage. In the past, additional hardware provisioning and more manpower were tactics that could address these challenges. But ML initiatives are more complex and cannot be solved by simply throwing bodies and servers at the problem.
There is promise in new architectures that are starting to receive more attention. In particular, Apache Pulsar, an open-source distributed messaging and streaming platform, has shown promise as a foundation for AI workloads. Its core architecture is cloud-native, separates storage from compute, and scales horizontally. As the amount of data explodes, human capital continues to be in short supply, and companies continue to move towards AI to gain a competitive advantage, we believe that the ability to make data available for AI models will be a competitive differentiator for companies that are serious about operationalizing these technologies.
The Data Standard is a meaningful community of Data Scientists and Data Engineers built with love, empathy, and mindfulness. We aim to foster conversations and share insight among the leading professionals in data science.
This is a working paper, and here at The Data Standard we encourage the sharing of knowledge and learning from others. If you have any feedback or would like to share your thoughts, feel free to comment below or email Stephanie directly at email@example.com.
About the Author
Stephanie Moore is a Data Scientist at The Data Standard, the premier user community for Data Science, AI, ML, and Cyber Security thought leaders. Stephanie’s background is in data wrangling, machine learning, and data visualization. Prior to working at The Data Standard, Stephanie graduated from the University of California San Diego as a scholar-athlete and Data Science major. She is passionate about turning complex data into actionable insights through analysis and storytelling, with the hope of using her skills to create innovative solutions with a worldwide impact.