Since Intel doesn’t plan to have a desktop CPU with AI capabilities until later this year, PC makers are turning to chip startups instead — and the Lenovo ThinkCentre Neo Ultra may show the way, potentially sporting AI cards from MemryX and Kinara inside.
Lenovo will launch the ThinkCentre Neo Ultra PC in June for about $1,000, product manager Bryan Lin said from Lenovo’s booth at CES 2024. While Lenovo’s documentation does not officially list either AI processor, their inclusion is likely. The small content-creation desktop was at CES showcasing both AI cards.
While AMD, Intel, and Qualcomm have all shown mobile processors with integrated NPUs (neural processing units), only AMD has announced a desktop Ryzen processor with an NPU inside. Intel, which holds the dominant share of the PC processor market, will have to wait until the launch of Arrow Lake to make an NPU available to desktop PC makers.
Meanwhile, more PC makers are realizing that an “AI PC” can actually be constructed with just a CPU and a GPU; what an NPU adds is more power-efficient AI. If you’re a desktop PC maker, with typically fewer concerns about power consumption, a CPU and a GPU may be sufficient. But businesses, which want to make money from AI, want AI now, and they do care about minimizing power consumption at scale. In this, at least, the business market may push ahead of consumer PCs.
Mark Hachman / IDG
“What we’re seeing now is that the discrete graphics card is too hungry in terms of form factor and power, thermal design, et cetera,” Lin said. “So an NPU card drawing about 5 to 10 watts can give us a certain level of AI capabilities.”
But what about when Arrow Lake debuts?
“With Arrow Lake what I’m getting is that it’s still very limited [in terms of] power,” Lin said. “So, at least 18 to 24 months from now, I think discrete [AI accelerators] will still be part of it. And especially for desktop, where we don’t have the limitation of battery.”
The ThinkCentre Neo Ultra will include up to an Intel Core i9 vPro processor of an undisclosed architecture, with up to 64GB of DDR5-5200 memory. It will also include a creator-class Nvidia GeForce RTX 4060 GPU, up to 4TB of SSD storage, and a 350W internal power supply. It’s a 3.6-liter chassis, measuring 7.67 x 7.67 x 4.21 inches.
Lenovo has what it is calling an AI engine, which routes each workload to the hardware where it fits best, Lin said.
Lin said that there are a number of AI chip startups that the company is working with, including MemryX and Kinara, the two AI chip companies being shown off at the booth.
Meet MemryX, one of the first AI accelerator makers
MemryX manufactures the MX3 Edge AI Accelerator. The company’s software development kit, which is what Lenovo is showing off inside the ThinkCentre, consists of four MX3 chips mounted on an M.2 PCI Express card (Gen3, somewhat surprisingly), though it can run inside a USB 3.2 card as well.
MemryX rates each MX3 at 10 TFLOPS (trillion floating-point operations per second) instead of the more conventional TOPS (trillion operations per second). That’s because the MX3 defaults to 16-bit floating-point operations and 8-bit weights, rather than the integer operations that are the more common metric, according to Roger Peene, the vice president of product and business development for MemryX.
“When there’s an opportunity to use discrete solutions, everybody will use it until Intel or AMD integrates it,” Peene said. “So everybody knows Intel’s way behind… they’ve amped up their marketing. They’re clearly not happy that Lenovo would choose a startup to run AI in a PC. So that’s kind of the story.”
Each MX3 consumes 1 to 2 watts on average, Peene said. The chips support Linux, Android, and Windows, as well as the TensorFlow, TensorFlow Lite, PyTorch, ONNX, and Keras frameworks.
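For a rough sense of scale, the per-chip figures MemryX quoted can be multiplied out for the four-chip M.2 card. The sketch below is back-of-the-envelope arithmetic, not a MemryX tool: it assumes the four chips scale linearly and that the 1-to-2-watt figure covers each chip's entire draw. The function name `card_totals` is made up for illustration.

```python
def card_totals(chips=4, tflops_per_chip=10.0, watts_low=1.0, watts_high=2.0):
    """Multiply per-chip figures out to a whole card (assumes linear scaling)."""
    total_tflops = chips * tflops_per_chip
    power_low = chips * watts_low
    power_high = chips * watts_high
    return {
        "tflops": total_tflops,
        "watts": (power_low, power_high),
        # Worst case divides by the high power figure, best case by the low one.
        "tflops_per_watt": (total_tflops / power_high, total_tflops / power_low),
    }


totals = card_totals()
print(totals)
```

Under those assumptions the card works out to 40 TFLOPS at 4 to 8 watts, a range that overlaps the 5 to 10 watts Lin cited for an NPU card.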
Each chip can run a model with 10 million 8-bit parameters, scaled as necessary. Out of the box, the MX3 can run YOLO v7 tiny at 416×416, 375fps (x2) without pruning or training, or SSDMobileNet (224×224) at 1403fps.
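Those numbers are easier to picture as chip counts and per-frame times. The snippet below is an illustrative calculation only. It reads "scaled as necessary" as meaning that larger models are spread across more chips, and it assumes parameters divide evenly among chips with no overhead. The 25-million-parameter model is a hypothetical example, and `chips_needed` and `ms_per_frame` are names invented here.

```python
import math

PARAMS_PER_CHIP = 10_000_000  # 8-bit parameters per MX3, as quoted by MemryX


def chips_needed(model_params, params_per_chip=PARAMS_PER_CHIP):
    """Smallest number of chips whose combined capacity holds the model."""
    return math.ceil(model_params / params_per_chip)


def ms_per_frame(fps):
    """Convert a frames-per-second figure into milliseconds per frame."""
    return 1000.0 / fps


# Hypothetical 25-million-parameter model: needs three chips, so it fits on a four-chip card.
print(chips_needed(25_000_000))

# Per-frame times implied by the quoted frame rates.
print(round(ms_per_frame(375), 2))   # YOLO v7 tiny
print(round(ms_per_frame(1403), 2))  # SSDMobileNet
```

On this reading, a four-chip card would top out at roughly 40 million 8-bit parameters, and the quoted frame rates correspond to about 2.67 milliseconds and 0.71 milliseconds per frame, respectively.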
We haven’t had a chance to speak to Kinara, though the company launched its Ara-2 Edge AI processor last fall. “As an example of its capabilities for processing Generative AI models, Ara-2 can hit 10 seconds per image for Stable Diffusion and tens of tokens/sec for LLaMA-7B,” the company said in a press release.
Both the MemryX and Kinara AI chips are being positioned first as AI for image recognition, with one MemryX demo showing off how it could recognize whether construction workers had donned the right protective gear. Still, AI can be used for all sorts of purposes: games, avatars, local language models/chatbots, and more.
What’s more important, however, is that companies like Nvidia, Rendition, and 3Dfx launched years ago as makers of 3D accelerators. Some fell by the wayside, and the survivors now dominate the content-creation and gaming industries. Expect a new wave of AI accelerator cards to challenge them.