IBM releases Project CodeNet, a 14-million-sample dataset to help automate programming tasks

At a Glance

Project CodeNet, a 14-million-sample dataset released by IBM’s AI research arm, is intended to help create machine learning models for programming tasks. The dataset contains 14 million code samples, totaling 500 million lines of code written in 55 programming languages. The code samples were collected from over 4,000 challenges on AIZU and AtCoder, two popular online programming platforms.

IBM’s AI research arm has released a 14-million-sample dataset, called Project CodeNet, to aid in creating machine learning models for programming tasks.

It’s unlikely that machine learning models based on the CodeNet dataset will render human programmers obsolete, but there’s reason to be optimistic that they will increase developer productivity.

With Project CodeNet, IBM's developers have attempted to build a multi-purpose dataset that can be used to train machine learning models for a variety of tasks. According to its developers, CodeNet is a large-scale, diverse, and high-quality dataset meant to speed up algorithmic advances in AI for Code.

There are 14 million code samples in the dataset, totaling 500 million lines of code written in 55 different programming languages. The code samples were gathered from approximately 4,000 challenges on AIZU and AtCoder, two popular online coding platforms. The code samples include both correct and incorrect submissions to the challenges.

The IBM researchers also went to considerable lengths to ensure that the dataset is balanced along several dimensions, such as programming language, acceptance status, and error type.
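As a rough illustration of what checking balance along these dimensions could look like, the sketch below computes the share of samples per language and per status. The records, field names, and status labels are made up for the example and are not taken from CodeNet itself.

```python
from collections import Counter

# Hypothetical submission records; field names and values are illustrative only.
submissions = [
    {"language": "C++", "status": "Accepted"},
    {"language": "C++", "status": "Wrong Answer"},
    {"language": "Python", "status": "Accepted"},
    {"language": "Python", "status": "Runtime Error"},
    {"language": "Java", "status": "Time Limit Exceeded"},
]


def distribution(records, field):
    """Return the share of records for each distinct value of `field`."""
    counts = Counter(record[field] for record in records)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


language_share = distribution(submissions, "language")
status_share = distribution(submissions, "status")
```

A heavily skewed share on any one dimension would suggest that a model trained on the data could end up biased toward that language or outcome.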

CodeNet can also be used to develop machine learning models for code recommendation. Recommendation tools range from simple autocomplete-style models that finish the current line of code to more sophisticated systems that write full functions or blocks of code.

Data scientists can use CodeNet to build code optimization systems because it contains extensive metadata about memory use and execution time. They can also use the error-type metadata to train deep learning models to detect possible flaws in source code.
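To make the role of the metadata concrete, here is a minimal sketch of one way execution-time data could be used: picking the fastest accepted submission for each problem and language as a reference point for optimization. The CSV content and column names (`submission_id`, `problem_id`, `language`, `status`, `cpu_time`, `memory`) are assumptions made for this example, not a confirmed description of CodeNet's files.

```python
import csv
import io

# Hypothetical metadata rows; columns and values are assumptions for illustration.
SAMPLE_CSV = """submission_id,problem_id,language,status,cpu_time,memory
s1,p1,Python,Accepted,120,9000
s2,p1,Python,Accepted,45,12000
s3,p1,Python,Wrong Answer,10,8000
s4,p2,C++,Accepted,5,3000
s5,p2,C++,Time Limit Exceeded,2000,3100
"""


def fastest_accepted(csv_text):
    """Map (problem_id, language) to the (submission_id, cpu_time) of the fastest accepted run."""
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Only accepted submissions are meaningful as performance references.
        if row["status"] != "Accepted":
            continue
        key = (row["problem_id"], row["language"])
        cpu_time = int(row["cpu_time"])
        if key not in best or cpu_time < best[key][1]:
            best[key] = (row["submission_id"], cpu_time)
    return best


fastest = fastest_accepted(SAMPLE_CSV)
```

Pairing each slower accepted submission with the fastest one for the same problem is one plausible way to build training examples for an optimization model; the same filtering idea, applied to the status column, would separate failing submissions by error type.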

The IBM researchers' results, detailed in a paper about Project CodeNet, show accuracy above 90% on most of the tasks they tested.

Despite the many advances in machine learning and artificial intelligence, there will always be a need for human programmers. What may change is the way they do their work.
