Microsoft, OpenAI may have solved a fundamental AI bottleneck


Microsoft and OpenAI have developed a new method for fine-tuning massive AI models that are otherwise too expensive to retrain, such as GPT-3.
A blog post published by Microsoft Research describes a technique called µ-Parametrization (or µP), which exploits the discovery that small- and large-scale AI models behave similarly under certain conditions to minimize the compute resources required for optimization.
Although you’d need a doctorate to make sense of the specifics, the essential message is this: with µ-Parametrization, it will be cheaper and simpler to develop larger-scale AI models capable of yielding far superior performance to those available today.
Optimizing AI models
As explained in the blog post, one reason large AI models are difficult to train effectively is because we have little insight into the way their behavior changes as they scale. As such, the larger the AI model, the less well-tuned researchers would currently expect it to be.
However, µ-Parametrization offers a route to tuning large-scale models at much lower cost and with much greater efficiency, by capitalizing on the insight that neural networks of varying sizes share the same optimal hyperparameters (HPs) under certain conditions.
Essentially, this means a small-scale tuning process can be extrapolated outwards and mapped onto a much larger model, instead of retraining an entire multi-billion-parameter model from scratch.
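As a rough illustration of that workflow (a sketch of the general idea, not code from the research), the Python snippet below sweeps a learning rate on a cheap proxy model and reuses the winner at full scale; `build_model` and `train_and_evaluate` are hypothetical stand-ins, and the widths and grid values are made up for illustration.

```python
import random

def build_model(width):
    """Hypothetical stand-in for constructing a µP-parameterized network."""
    return {"width": width}

def train_and_evaluate(model, lr):
    """Hypothetical stand-in for a training run; returns a validation loss."""
    return random.random()  # placeholder value, for illustration only

# Sweep learning rates on a small, cheap-to-train proxy model.
candidate_lrs = [3e-4, 1e-3, 3e-3, 1e-2]  # illustrative grid
proxy_losses = {lr: train_and_evaluate(build_model(width=256), lr)
                for lr in candidate_lrs}
best_lr = min(proxy_losses, key=proxy_losses.get)

# Under µP, the proxy's optimum is expected to stay (near-)optimal as the
# model widens, so it is copied straight to the large model: one big
# training run instead of a hyperparameter sweep at full scale.
big_model = build_model(width=16384)
train_and_evaluate(big_model, lr=best_lr)
```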
“µP’s principled way of parameterizing the model and selecting the learning rate make it easier for anybody to scale the training of deep neural networks. Such an elegant combination of beautiful theory and practical impact,” said Johannes Gehrke, Lab Director at Microsoft Research.
To put the theory into practice, Microsoft worked with OpenAI to unleash µ-Parametrization on GPT-3, a natural language model whose largest iteration is made up of 175 billion parameters.
“After parameterizing a version of GPT-3 with relative attention in µP, we tuned a small proxy model with 40 million parameters before copying the best hyperparameter combination to the 6.7-billion parameter variant of GPT-3,” Microsoft explained.
The results were quite startling: the collaborators produced an even more performant version of GPT-3, with the tuning consuming just 7% of the compute used to pretrain the 6.7-billion-parameter model.
To help other practitioners benefit from these findings, Microsoft has published a PyTorch package designed to help them integrate µ-Parametrization into their existing models, a process that can supposedly be finicky in practice.
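Microsoft's published package is `mup` (github.com/microsoft/mup). The sketch below follows the usage pattern described in its documentation: replace the output layer with `MuReadout`, register base shapes from a small base model, and switch to the package's optimizer so learning rates scale correctly with width. The toy architecture and the widths here are illustrative assumptions, not Microsoft's code.

```python
import torch.nn as nn
from mup import MuReadout, set_base_shapes, MuAdam

class MLP(nn.Module):
    """A toy model in which only the hidden width is meant to be scaled."""
    def __init__(self, width):
        super().__init__()
        self.hidden = nn.Linear(784, width)
        # The output layer must be MuReadout so mup can rescale it with width.
        self.readout = MuReadout(width, 10)

    def forward(self, x):
        return self.readout(self.hidden(x).relu())

base = MLP(width=64)     # fixes the "base shapes" everything is measured against
delta = MLP(width=128)   # differs from base in every dimension being scaled
model = MLP(width=4096)  # the large model we actually want to train

# Record how each parameter's shape scales relative to the base model.
set_base_shapes(model, base, delta=delta)

# Use mup's optimizer so per-parameter learning rates follow µP scaling;
# the lr value itself would come from a sweep on a small proxy model.
optimizer = MuAdam(model.parameters(), lr=1e-3)
```

With this in place, hyperparameters tuned on a narrow version of the model should transfer to the wide one without further sweeps.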
The company also says there is still plenty to be understood about the scaling of AI models, however, and has pledged to continue its work to “derive more principled approaches to large-scale machine learning”.