Alexa now runs on more powerful cloud instances, opening the door for complex new features


Amazon’s cloud-based voice service Alexa is about to get a whole lot more powerful now that the Amazon Alexa team has migrated the vast majority of its GPU-based machine learning inference workloads to Amazon EC2 Inf1 instances.
These new instances are powered by AWS Inferentia and the upgrade has resulted in 25 percent lower end-to-end latency and 30 percent lower cost compared to GPU-based instances for Alexa’s text-to-speech workloads.
As a result of switching to EC2 Inf1 instances, Alexa engineers can now begin using more complex algorithms to improve the overall experience for owners of the new Amazon Echo and other Alexa-powered devices.
In addition to Amazon Echo devices, more than 140,000 models of smart speakers, lights, plugs, smart TVs and cameras are powered by Amazon’s cloud-based voice service. Each month, tens of millions of customers interact with Alexa to control their home devices, listen to music and the radio, stay informed, or be educated and entertained by the more than 100,000 Alexa Skills available for the platform.
In a press release, AWS technical evangelist Sébastien Stormacq explained why the Amazon Alexa team decided to move its machine learning inference workloads off GPU-based instances, saying:
“Alexa is one of the most popular hyperscale machine learning services in the world, with billions of inference requests every week. Of Alexa’s three main inference workloads (ASR, NLU, and TTS), TTS workloads initially ran on GPU-based instances. But the Alexa team decided to move to the Inf1 instances as fast as possible to improve the customer experience and reduce the service compute cost.”
AWS Inferentia
AWS Inferentia is a custom chip built by AWS to accelerate machine learning inference workloads while also optimizing their cost.
Each chip contains four NeuronCores, and each core implements a high-performance systolic array matrix multiply engine, which massively speeds up deep learning operations such as convolution and transformers. NeuronCores also come equipped with a large on-chip cache that cuts down on external memory accesses, dramatically reducing latency while increasing throughput.
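To give a rough sense of what a systolic array matrix multiply engine does, here is a minimal Python sketch. It is not the Inferentia design, whose internals are not described here. It assumes a generic output-stationary layout in which each processing element (PE) at grid position (i, j) accumulates one output value while operands arrive with a skew of i + j cycles. The function name `systolic_matmul` and the cycle accounting are illustrative assumptions.

```python
def systolic_matmul(a, b):
    """Multiply a (n x k) by b (k x m) following an output-stationary
    systolic schedule. Returns the product and the number of cycles used.

    Assumption: this models only the timing of the schedule, not the
    hardware registers that pass operands between neighbouring PEs.
    """
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")

    c = [[0] * m for _ in range(n)]
    # The last operand pair reaches PE (n-1, m-1) at cycle n + m + k - 3,
    # so the whole product takes n + m + k - 2 cycles.
    cycles = n + m + k - 2
    for t in range(cycles):
        for i in range(n):
            for j in range(m):
                # Operands reach PE (i, j) delayed by i + j cycles.
                step = t - i - j
                if 0 <= step < k:
                    c[i][j] += a[i][step] * b[step][j]
    return c, cycles


product, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, cycles)  # [[19, 22], [43, 50]] 4
```

In hardware, all PEs work in parallel within a cycle, so the two inner loops above collapse into a single step. That is where the speed-up over a sequential multiply comes from in this simplified model.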
For users wishing to take advantage of AWS Inferentia, the custom chip can be used natively from popular machine learning frameworks including TensorFlow, PyTorch and MXNet with the AWS Neuron software development kit.
In addition to the Alexa team, Amazon Rekognition is also adopting the new chip: running models such as object classification on Inf1 instances resulted in eight times lower latency and doubled throughput compared to running these models on GPU instances.