Microsoft believes its AI can accurately detect security bugs


Microsoft has announced that it has developed a new system that is able to correctly distinguish between security and non-security software bugs 99 percent of the time. The system is also able to accurately identify critical, high-priority security bugs on average 97 percent of the time.
Microsoft used a data set of 13 million work items and bugs from 47,000 of its developers, stored across Azure DevOps and GitHub repositories, to develop a process and machine learning model that correctly distinguishes between security and non-security bugs. In the coming months, the company plans to open source the methodology on GitHub along with example models and other resources so that the system can be used to help support human experts.
While Microsoft was developing the model, security experts approved the training data, and statistical sampling was used to give them a manageable amount of data to review. This data was then encoded into representations called feature vectors, and Microsoft’s researchers designed the system using a two-step process.
The model first learned to classify security and non-security bugs and then it learned to apply security labels (critical, important or low-impact) to those bugs.
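As a rough illustration of that two-step structure, the sketch below chains a security check with a severity labeler. The function names and the keyword rules are hypothetical stand-ins invented for this example. They are not Microsoft's models, and running the second step only on bugs flagged by the first is an assumption about how the two steps fit together.

```python
from typing import Callable, Optional

def triage(
    title: str,
    is_security: Callable[[str], bool],
    severity_of: Callable[[str], str],
) -> Optional[str]:
    """Step 1 separates security from non-security bugs.
    Step 2 assigns a severity label, and only runs on security bugs."""
    if not is_security(title):
        return None  # non-security bug: no security label
    return severity_of(title)

# Hypothetical keyword stand-ins for the two trained classifiers.
def is_security_stub(title: str) -> bool:
    return any(word in title.lower() for word in ("overflow", "injection", "xss"))

def severity_stub(title: str) -> str:
    # Labels taken from the article: critical, important or low-impact.
    return "critical" if "overflow" in title.lower() else "important"

security_label = triage("Buffer overflow in login parser", is_security_stub, severity_stub)
plain_label = triage("Button misaligned on settings page", is_security_stub, severity_stub)
```

In a real system the two stubs would be replaced by trained classifiers. The wrapper only shows how the output of the first step gates the second.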
Identifying security bugs
In order to make its bug predictions, Microsoft’s model leverages two techniques.
The first is an information retrieval approach called term frequency-inverse document frequency (TF-IDF), which weighs how often a word appears in a document against how common that word is across the whole collection, in this case a collection of bug titles. According to Microsoft, its bug titles are usually quite short and contain around 10 words.
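To make the TF-IDF idea concrete, here is a minimal sketch in plain Python. It uses one common textbook weighting (term frequency times the log of the inverse document frequency). The article does not say which variant Microsoft uses, and the bug titles below are invented for illustration.

```python
import math
from collections import Counter

def tf_idf(titles):
    """Return one {term: weight} dict per title, using tf * log(N / df)."""
    docs = [title.lower().split() for title in titles]
    n_docs = len(docs)
    # Document frequency: the number of titles that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical bug titles, not taken from Microsoft's data.
titles = [
    "buffer overflow in login parser",
    "button misaligned in login page",
    "crash in settings page",
]
vectors = tf_idf(titles)
```

In the first title, "overflow" appears in only one title and gets the highest weight, "login" appears in two and gets a lower one, and "in" appears in every title and is weighted zero.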
The second technique the software giant uses is a logistic regression model that utilizes a logistic function to model the probability of a certain class or event existing.
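The sketch below shows what such a model looks like in its simplest form: a weighted sum of features is passed through the logistic function to produce a probability, and the weights are fitted by gradient descent. The feature vectors, the learning rate and the number of epochs are all assumptions chosen for a toy example. The article does not describe Microsoft's training setup.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weights, bias):
    """Probability that feature vector x belongs to the positive class."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_proba(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(features)
    return weights, bias

# Toy TF-IDF-style vectors (hypothetical): [weight of "overflow", weight of "button"]
features = [[0.9, 0.0], [0.7, 0.1], [0.0, 0.8], [0.1, 0.9]]
labels = [1, 1, 0, 0]  # 1 = security bug, 0 = non-security bug
weights, bias = train(features, labels)
```

Calling `predict_proba([0.8, 0.0], weights, bias)` then gives a probability above 0.5 for a title dominated by "overflow", while `predict_proba([0.0, 0.8], weights, bias)` falls below 0.5.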
In its blog post announcing the new system, Microsoft explained how it used machine learning models and security experts to better identify security bugs, saying:
“Every day, software developers stare down a long list of features and bugs that need to be addressed. Security professionals try to help by using automated tools to prioritize security bugs, but too often, engineers waste time on false positives or miss a critical security vulnerability that has been misclassified. To tackle this problem data science and security teams came together to explore how machine learning could help. We discovered that by pairing machine learning models with security experts, we can significantly improve the identification and classification of security bugs.”
Microsoft’s new bug-detecting system has already been deployed in production internally, and it is continually retrained with data approved by the company’s security experts, who monitor how many bugs are generated during software development.
Via VentureBeat