The goal is to simplify the integration and scaling of large data and AI workflows in the hybrid cloud.
In early July, IBM announced CodeFlare, an open source, serverless framework intended to simplify the integration and efficient scaling of large data and AI workflows in the hybrid cloud. CodeFlare is based on Ray, an open source distributed computing framework developed for machine learning applications. According to IBM, CodeFlare extends Ray’s capabilities with features that make it easier to scale workflows.
Data and machine learning analytics are now prevalent in almost every industry, and the tasks are becoming much more complex, according to the company. AI research requires ever larger datasets and more systems, and as these workflows grow more complex, researchers spend more and more time configuring their setups instead of doing data science.
To create a machine learning model today, researchers and developers must first train and optimize the model, IBM says. This may include data cleansing, feature extraction, and model optimization. CodeFlare aims to simplify this process with a Python-based interface for what IBM calls a pipeline, making it easier to integrate, parallelize, and share data.
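To make the idea of a pipeline concrete, here is a minimal sketch in plain Python. It does not use CodeFlare or Ray, and it is not IBM's interface: the function names (`clean`, `extract_features`, `run_pipeline`) are hypothetical, the model optimization stage is left out, and the standard library's thread pool stands in for a distributed runtime.

```python
from concurrent.futures import ThreadPoolExecutor


def clean(record):
    """Data cleansing stage: trim whitespace and lowercase the text."""
    return record.strip().lower()


def extract_features(text):
    """Feature extraction stage: a toy feature pair (character count, word count)."""
    return (len(text), len(text.split()))


def run_pipeline(records, max_workers=4):
    """Apply each stage to all records in parallel, feeding one stage into the next.

    Assumption: a local thread pool is used purely for illustration; a framework
    such as Ray would distribute the same work across a cluster.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        cleaned = list(pool.map(clean, records))               # stage 1
        features = list(pool.map(extract_features, cleaned))   # stage 2
    return features


if __name__ == "__main__":
    sample = ["  Hello World ", "CodeFlare  ", " Ray scales Python "]
    print(run_pipeline(sample))
```

The point of the sketch is only the shape of the problem: several stages, each applied across many records, with the output of one stage handed to the next. Wiring those stages together and scaling them is the part IBM says CodeFlare is meant to take off researchers' hands.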
The company says its new framework aims to unify pipeline workflows across multiple platforms without data scientists having to learn a new workflow language.
CodeFlare pipelines run on IBM’s new serverless platform, IBM Cloud Code Engine, and on Red Hat OpenShift. This will allow users to deploy CodeFlare almost anywhere, extending the benefits of serverless computing to data scientists and AI researchers, Big Blue said. The framework also eases integration with other cloud-native ecosystems by providing adapters for event triggers, such as the arrival of a new file, and for loading and partitioning data from a wide range of sources, including cloud object storage, data lakes, and distributed file systems.
“CodeFlare goes beyond isolated tasks to seamlessly integrate and scale end-to-end pipelines with a data scientist-friendly interface such as Python instead of using containers. CodeFlare can provide an easier way to integrate and scale entire pipelines with a unified runtime and programming interface,” said Priya Nagpurkar, director of hybrid cloud platform at IBM Research.
Experience to date has shown that the technology saves a great deal of time in many areas, such as pharmaceutical research or electronics design. IBM has also published a technical guide to support the open source solution.