Today, at Intel Labs Day 2020, Intel revealed ControlFlag, a machine programming system that uses machine learning to detect errors in code. Trained on over 1 billion unlabeled lines of production-quality code that contained various bugs, ControlFlag uses a technique known as anomaly detection: it learns the coding patterns that commonly appear in that code and flags deviations from them that are likely to cause a bug, irrespective of the programming language.
The system extends the tech giant’s Rapid Analysis for Developers project, which aims to help software engineers and researchers write code faster. ControlFlag uses unsupervised learning to train itself to identify patterns and stylistic choices in code. Intel notes that, as a result, it does not treat a difference in style as an error just because the code is ‘written differently’. An apt analogy is a traditional grammar-checking tool that checks an English sentence or phrase for correctness.
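To make the idea of pattern-based anomaly detection concrete, here is a minimal sketch in Python. It is not Intel's implementation, and the names `abstract_condition`, `find_anomalies` and `min_share` are hypothetical. It assumes a tiny corpus of `if` conditions, reduces each one to a coarse pattern, and flags any candidate whose pattern is rare in the corpus. No labels are needed, which is the sense in which the approach is unsupervised.

```python
import re
from collections import Counter

# Hypothetical tokenizer: identifiers, integers, a few two-character
# operators, then any other single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|&&|\|\||\S")


def abstract_condition(cond: str) -> str:
    """Reduce a condition to a coarse pattern such as 'VAR == NUM'."""
    out = []
    for tok in TOKEN.findall(cond):
        if tok[0].isalpha() or tok[0] == "_":
            out.append("VAR")   # any identifier
        elif tok.isdigit():
            out.append("NUM")   # any integer literal
        else:
            out.append(tok)     # operators and punctuation stay as-is
    return " ".join(out)


def find_anomalies(corpus, candidates, min_share=0.05):
    """Return candidates whose pattern makes up less than min_share of the corpus."""
    counts = Counter(abstract_condition(c) for c in corpus)
    total = sum(counts.values())
    flagged = []
    for cond in candidates:
        share = counts[abstract_condition(cond)] / total if total else 0.0
        if share < min_share:
            flagged.append(cond)
    return flagged


# Illustrative data only: conditions as they might appear inside `if (...)`.
corpus = ["x == 0", "y == 1", "n == 42", "ptr != NULL",
          "p != q", "i < 10", "count < 5", "a == 7"]
print(find_anomalies(corpus, ["len == 3", "x = 0"]))  # flags only 'x = 0'
```

In this toy corpus, `len == 3` matches a common pattern and passes, while `x = 0` (an assignment where a comparison was probably intended) matches nothing seen before and is flagged. A real system would need a far larger corpus and a richer representation of code than this token-level abstraction.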
When put to the test, ControlFlag identified bugs in production-quality code. In one case, it flagged an anomaly in cURL’s code that developers had not caught during code review. Intel has also started using the system in-house in software and firmware productization.
Justin Gottschlich, the Principal Scientist, Director and Founder of Machine Programming Research at Intel Labs, believes that the system can “dramatically reduce the time and money required to evaluate and debug code.” “According to studies, software developers spend approximately 50% of the time debugging,” he added. “With ControlFlag, and systems like it, I imagine a world where programmers spend notably less time debugging and more time on what I believe human programmers do best: expressing creative, new ideas to machines.”