Etched is building an AI chip that only runs one type of model

Data moving through a circuit board with CPU in the center.

Image Credits: Ignatiev / Getty Images

As generative AI touches a growing number of industries, the companies producing chips to run the models are benefiting enormously. Nvidia, in particular, wields massive influence, commanding an estimated 70% to 95% of the market for AI chips. Cloud providers from Meta to Microsoft are spending billions of dollars on Nvidia GPUs, wary of falling behind in the generative AI race.

It’s understandable, then, that generative AI vendors aren’t pleased with the status quo. A large portion of their success hinges on the whims of the dominant chipmakers. And so they, along with opportunistic VCs, are on the hunt for promising upstarts to challenge the AI chip incumbents.

Etched is among the many, many alternative chip companies vying for a seat at the table — but it’s also among the most intriguing. Only two years old, Etched was founded by a pair of Harvard dropouts, Gavin Uberti (ex-OctoML and ex-Xnor.ai) and Chris Zhu, who, along with Robert Wachen and former Cypress Semiconductor CTO Mark Ross, sought to create a chip that could do just one thing: run AI models.

That’s not unusual. Plenty of startups and tech giants are developing chips that exclusively run AI models, also known as inference chips. Meta has MTIA, Amazon has Inferentia, and so on. But Etched’s chips are unique in that they run only a single type of model: the transformer.

The transformer, proposed by a team of Google researchers back in 2017, has become the dominant generative AI model architecture by far.

Transformers underpin OpenAI’s video-generating model Sora. They’re at the heart of text-generating models like Anthropic’s Claude and Google’s Gemini. And they power art generators such as the newest version of Stable Diffusion.

“In 2022, we made a bet that transformers would take over the world,” Uberti, Etched’s CEO, told TechCrunch in an interview. “We’ve hit a point in the evolution of AI where specialized chips that can perform better than general-purpose GPUs are inevitable — and the technical decision-makers of the world know this.”

Etched’s chip, called Sohu, is an ASIC (application-specific integrated circuit) — a chip tailored for a particular application — made for running transformers. Manufactured using TSMC’s 4nm process, Sohu can deliver dramatically better inferencing performance than GPUs and other general-purpose AI chips while drawing less energy, claims Uberti.

“Sohu is an order of magnitude faster and cheaper than even Nvidia’s next generation of Blackwell GB200 GPUs when running text, image and video transformers,” Uberti said. “One Sohu server replaces 160 H100 GPUs. … Sohu will be a more affordable, efficient and environmentally friendly option for business leaders that need specialized chips.”

How does Sohu achieve all this? In a few ways, but the most obvious (and intuitive) is a streamlined inferencing hardware-and-software pipeline. Because Sohu doesn’t run non-transformer models, the Etched team could do away with hardware components not relevant to transformers and trim the software overhead traditionally used for deploying and running non-transformers.
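For context, the workload such a chip specializes in is dominated by one kernel: scaled dot-product attention. Below is a minimal NumPy sketch of that computation, purely illustrative and not Etched's implementation:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """softmax(Q K^T / sqrt(d)) @ V -- the transformer's core kernel."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                 # (seq, seq) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                            # (seq, d) mixed values

rng = np.random.default_rng(0)
seq_len, d_model = 8, 16
q = rng.standard_normal((seq_len, d_model))
k = rng.standard_normal((seq_len, d_model))
v = rng.standard_normal((seq_len, d_model))

out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (8, 16)
```

A GPU must stay general enough to run this alongside convolutions, recurrences and everything else; a transformer-only ASIC can hard-wire just the matrix multiplies and softmax above.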

A graph from Etched comparing hardware performance running Meta’s open model Llama 70B.
Image Credits: Etched

Etched is arriving on the scene at an inflection point in the race for generative AI infrastructure. Beyond cost concerns, the GPUs and other hardware components necessary to run models at scale today are dangerously power-hungry.

Goldman Sachs predicts that AI is poised to drive a 160% increase in data center electricity demand by 2030, contributing to a significant uptick in greenhouse gas emissions. Researchers at UC Riverside, meanwhile, estimate that global AI usage could cause data centers to suck up 1.1 trillion to 1.7 trillion gallons of fresh water by 2027, impacting local resources. (Many data centers use water to cool servers.)

Uberti optimistically — or bombastically, depending on how you interpret it — pitches Sohu as the solution to the industry’s consumption problem.

“In short, our future customers won’t be able to afford not to switch to Sohu,” Uberti said. “Companies are willing to take a bet on Etched because speed and cost are existential to the AI products they are trying to build.”

But can Etched, assuming it meets its goal of bringing Sohu to the mass market in the next few months, succeed when so many others are following close behind it?

The company lacks a direct competitor at present, but AI chip startup Perceive recently previewed a processor with hardware acceleration for transformers. Groq has also invested heavily in transformer-specific optimizations for its ASIC.

Competition aside, what if transformers one day fall out of favor? Uberti says, in that case, Etched will do the obvious: Design a new chip. Fair enough, but that’s a pretty drastic fallback option, considering how long it’s taken to bring Sohu to fruition.

None of these concerns have dissuaded investors from pouring an enormous amount of money into Etched, though.

Today, Etched said it has closed a $120 million Series A funding round, co-led by Primary Venture Partners and Positive Sum Ventures. Bringing Etched’s total raised to $125.36 million, the round saw participation from heavyweight angel backers including Peter Thiel (Uberti, Zhu and Wachen are Thiel Fellowship alums), GitHub CEO Thomas Dohmke, Cruise (and the Bot Company) co-founder Kyle Vogt, and Quora co-founder Charlie Cheever.

These investors presumably believe Etched has a reasonable chance of successfully scaling up its business of selling servers. Perhaps it does — Uberti claims unnamed customers have reserved “tens of millions of dollars” in hardware so far. The forthcoming launch of the Sohu Developer Cloud, which will let customers preview Sohu via an online interactive playground, should drive additional sales, Uberti suggested.

Still, it seems too early to tell whether this will be enough to propel Etched and its 35-person team into the future its co-founders are envisioning. The AI chip segment can be unforgiving in the best of times — see the high-profile near-failures of AI chip startups like Mythic and Graphcore, and the declining investment in AI chip ventures in 2023.

Uberti makes a strong sales pitch, though: “Video generation, audio-to-audio modalities, robotics, and other future AI use cases will only be possible with a faster chip like Sohu. The entire future of AI technology will be shaped by whether the infrastructure can scale.”

Qualcomm's new Snapdragon chip aims to bring 5G to sub-$100 devices

Snapdragon 4s Gen 2

Image Credits: Qualcomm

Qualcomm on Tuesday announced the Snapdragon 4s Gen 2. The entry-level chip aims to make 5G accessible to 2.8 billion smartphone users in price-sensitive markets, including India and Latin America.

Manufacturers have lately struggled to win over feature phone users in India, the world’s second-biggest smartphone market, primarily because of a lack of 5G models in the sub-$100 segment.

According to IDC, this entry-level segment of the Indian smartphone market declined 14% year-on-year to a 15% share, with 5.1 million units shipped in the first quarter. China’s Xiaomi and spin-off brand Poco continue to lead the segment. Overall, the country’s smartphone market grew 11% year-on-year to 34 million units, dominated by the $200-$400 segment.
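As a quick sanity check, the cited shipment figures are consistent with the reported share:

```python
# Rough sanity check of the IDC figures cited above (Q1 shipments, India).
entry_level_units_m = 5.1    # sub-$100 smartphone shipments, millions
total_market_units_m = 34.0  # total Indian smartphone shipments, millions

share = entry_level_units_m / total_market_units_m
print(f"Entry-level share: {share:.1%}")  # Entry-level share: 15.0%
```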

Qualcomm aims to fill the gap in the entry-level smartphone segment with this new chip. Xiaomi will launch its first device based on the 4s Gen 2 in India later this year, the chipmaker confirmed to TechCrunch.

TechCrunch recently reported that Indian telecom giant Jio was exploring the development of 5G feature phones, as millions of users in the country are not upgrading to smartphones.

The Snapdragon 4s Gen 2 is a feature-limited version of the Snapdragon 4 Gen 2, which launched in June last year. Built on a 4nm process, the new chip includes an octa-core Kryo CPU comprising two performance cores clocked at up to 2.0GHz and six efficiency cores at up to 1.8GHz.

The Snapdragon 4s Gen 2 also drops support for non-standalone (NSA) 5G, the deployment mode that is currently widespread because it lets telcos offer high-speed connectivity over their existing network assets. The new chip instead supports only standalone (SA) 5G connectivity.

In India, Jio is the only telecom operator offering 5G in standalone mode. However, Airtel, the country’s second-largest telecom operator, is expected to launch standalone 5G in the future.

Kiranjeet Kaur, associate research director for IDC Asia/Pacific, told TechCrunch that while smartphone vendors in India “have been more aggressive” about delivering 5G phones at lower price points, the momentum is slower outside of the country.

Cost-cutting measures have helped Qualcomm reduce the price of the Snapdragon 4s Gen 2, positioning it for future lower-end 5G phones in India and other emerging markets.

“Some concerns remain — what is the benefit to this user to migrate to 5G, the additional cost for device/plan, or any compromises he needs to make in the smartphone feature set in order to have a lower-priced 5G smartphone?” Kaur explained. “But I do think more support from the component side will help OEMs and also drive more competition in the lower price segment to build some momentum here.”

Qualcomm next-gen XR chip promises up to 4.3K resolution per eye

child wearing VR headset

Image Credits: Qualcomm

Just ahead of CES, Qualcomm today announced the next generation of its Snapdragon XR platform, the aptly named XR2+ Gen 2. The new system-on-a-chip promises up to a 4.3K resolution per eye at 90 frames per second (and a slightly reduced resolution at 120 fps), as well as a 2.5x GPU performance increase and 8x better AI performance, with full-color video see-through latency pegged at 12 milliseconds.
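To put the headline spec in perspective, here is a back-of-envelope pixel-throughput calculation. Qualcomm doesn't state the exact pixel dimensions behind "4.3K," so 4320 x 4320 per eye is an assumption used only to illustrate scale:

```python
# Back-of-envelope pixel throughput for the XR2+ Gen 2's headline spec.
# 4320 x 4320 per eye is an assumed interpretation of "4.3K", not an
# official Qualcomm figure.
width = height = 4320
eyes = 2
fps = 90

pixels_per_second = width * height * eyes * fps
print(f"{pixels_per_second / 1e9:.1f} gigapixels/s")  # 3.4 gigapixels/s
```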

Over the last few years, Qualcomm has built out a broad AR/VR/XR platform. It includes the Snapdragon AR chips, which, for example, power the Meta/Ray-Ban smart glasses. The lineup is a bit complicated: the AR1 Gen 1 is meant for smart glasses without a screen, the AR2 Gen 1 for AR-enabled smart glasses, and then there are the XR1 and XR2 chips. The XR2+ Gen 2 is the new flagship of the series, besting the previously released non-plus XR2 Gen 2, which “only” offered 3K resolution. As before, Qualcomm will keep the existing chips in production.

Image Credits: Qualcomm

“Snapdragon XR2+ Gen 2 unlocks 4.3K resolution which will take XR productivity and entertainment to the next level by bringing spectacularly clear visuals to use cases such as room-scale screens, life-size overlays and virtual desktops,” said Hugo Swart, vice president and general manager of XR at Qualcomm. “We are advancing our commitment to power the best XR devices and experiences that will supercharge our immersive future.”

One interesting aspect of today’s announcement is that Qualcomm is not just launching its own reference architecture but also partnering with Google and Samsung to bring the platform to their respective ecosystems. Other launch partners include HTC Vive, Immersed and Play for Dream (formerly YVR — the VR headset maker, not the airport in Vancouver, Canada).

Read more about CES 2024 on TechCrunch

Foxconn setting up chip packaging and testing venture with India's HCL

Foxconn City complex in Shenzhen, China

Image Credits: Thomas Lee/Bloomberg (opens in a new window) / Getty Images

Foxconn has set up a joint venture with Indian IT giant HCL Group to establish its semiconductor packaging and testing operations in India, as the Apple manufacturing partner looks to expand its presence in the South Asian nation to reduce reliance on China.

As a part of the deal, Foxconn Hon Hai Technology India Mega Development, a subsidiary of the Taiwanese company, will invest $37.2 million for a 40% stake, it disclosed in a stock exchange filing.

This will be Foxconn’s first investment toward setting up Outsourced Semiconductor Assembly And Test (OSAT) operations in India. The company has committed to putting billions of dollars into the country to bolster its domestic manufacturing for customers, including Apple and Xiaomi.

“Foxconn looks forward to jointly setting up OSAT operations in India with HCL. Through this investment, the partners aim to build an ecosystem and foster supply chain resilience for the domestic industry. Foxconn will deploy its BOL, or build-operate-localize, model to support local communities,” the Taiwanese company said in a statement to TechCrunch.

In November last year, Foxconn announced plans to invest $1.5 billion in India to fulfill its “operational needs.” The company also partnered with local conglomerate Vedanta to set up a $20 billion semiconductor unit in the Indian state of Gujarat. However, it eventually pulled out of the deal in July while committing to “actively reviewing the landscape for optimal partners.”

The manufacturer also submitted a fresh application to establish a semiconductor fabrication unit in the country later in the year, Deputy IT Minister Rajeev Chandrasekhar told parliament.

“HCL Group has a strong engineering and manufacturing heritage and this is an opportunity that provides strategic adjacency to the Group portfolio. This is line with Government of India’s vision of ‘Make in India’ and ‘Atmanirbhar Bharat’ also,” an HCL Group spokesperson said.

Semron wants to replace chip transistors with 'memcapacitors'

Data moving through a circuit board with CPU in the center.

Image Credits: Ignatiev / Getty Images

A new Germany-based startup, Semron, is developing what it describes as “3D-scaled” chips to run AI models locally on smartphones, earbuds, VR headsets and other mobile devices.

Founded by Kai-Uwe Demasius and Aron Kirschen, engineering graduates of the Dresden University of Technology, Semron makes chips that use electrical fields to perform calculations instead of the electrical currents that conventional processors rely on. This, Kirschen claims, lets the chips achieve higher energy efficiency while keeping fabrication costs down.

“Due to an expected shortage in AI compute resources, many companies with a business model that [relies] on access to such capabilities risk their existence — for example, large startups that train their own models,” Kirschen told TechCrunch in an email interview. “The unique features of our technology will enable us to hit the price point of today’s chips for consumer electronics devices even though our chips are capable of running advanced AI, which others are not.”

Semron’s chips — for which Demasius and Kirschen filed an initial patent in 2016, four years before they founded Semron — tap a somewhat unusual component known as a “memcapacitor,” or a capacitor with memory, to run computations. The majority of computer chips are made of transistors, which unlike capacitors can’t store energy; they merely act like “on/off” switches, either letting an electric current through or stopping it.

Semron’s memcapacitors, made out of conventional semiconductor materials, work by exploiting a principle known in chemistry as charge shielding. The memcapacitors control an electric field between a top electrode and bottom electrode via a “shielding layer.” The shielding layer, in turn, is controlled by the chip’s memory, which can store the different “weights” of an AI model. (Weights essentially act like knobs in a model, manipulating and fine-tuning its performance as it trains on and processes data.)

The electric field approach minimizes the movement of electrons at the chip level, reducing energy usage — and heat. Semron aims to leverage the heat-reducing properties of the electric field to place potentially hundreds of layers of memcapacitors on a single chip — greatly increasing compute capacity.
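A rough software model of the idea, an analog in-memory matrix-vector multiply in which stored weights have limited per-cell precision, might look like the sketch below. The array sizes and precision are illustrative assumptions, not Semron specifications:

```python
import numpy as np

# Illustrative sketch only: a software stand-in for a charge-domain
# matrix-vector multiply, where weights live in the array (here, quantized
# to an assumed per-cell precision) and outputs accumulate along each line.
rng = np.random.default_rng(42)

weights = rng.uniform(-1, 1, size=(4, 8))  # model weights stored in-array
inputs = rng.uniform(-1, 1, size=8)        # activations applied as voltages

levels = 16                                # assumed per-cell precision
quantized = np.round(weights * (levels / 2)) / (levels / 2)

exact = weights @ inputs                   # ideal full-precision result
analog = quantized @ inputs                # what the in-memory array computes

print("max error:", np.abs(analog - exact).max())
```

The appeal of doing this in hardware is that the multiply-accumulate never leaves the memory array, which is where the energy savings the article describes come from.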

A schematic showing Semron’s 3D AI chip design. Image Credits: Semron

“We use this property as an enabler to deploy several hundred times the compute resources on a fixed silicon area,” Kirschen added. “Think of it like hundreds of chips in one package.”

In a 2021 study published in the journal Nature Electronics, researchers at Semron and the Max Planck Institute of Microstructure Physics successfully trained a computer vision model at energy efficiencies of over 3,500 TOPS/W — 35 to 300 times higher than existing techniques. TOPS/W is a bit of a vague metric, but the takeaway is that memcapacitors can lead to dramatic energy consumption reductions while training AI models.

Now, it’s early days for Semron, which Kirschen says is in the “pre-product” stage and has “negligible” revenue to show for it. Often the toughest part of ramping up a chip startup is mass manufacturing and attaining a meaningful customer base — albeit not necessarily in that order.

Making matters more difficult for Semron, it faces stiff competition from custom chip ventures like Kneron, EnCharge and Tenstorrent, which have collectively raised tens of millions of dollars in venture capital. EnCharge, like Semron, is designing chips that use capacitors rather than transistors, though with a different underlying architecture.

Semron, however — which has an 11-person workforce that it’s planning to grow by around 25 people by the end of the year — has managed to attract funding from investors, including Join Capital, SquareOne, OTB Ventures and Onsight Ventures. To date, the startup has raised €10 million (~$10.81 million).

Said SquareOne partner Georg Stockinger via email:

“Computing resources will become the ‘oil’ of the 21st century. With infrastructure-hungry large language models conquering the world and Moore’s law reaching the limits of physics, a massive bottleneck in computing resources will shape the years to come. Insufficient access to computing infrastructure will greatly slow down productivity and competitiveness both of companies and entire nation-states. Semron will be a key element in solving this problem by providing a revolutionary new chip that is inherently specialized on computing AI models. It breaks with the traditional transistor-based computing paradigm and reduces costs and energy consumption for a given computing task by at least 20x.”

Meta unveils its newest custom AI chip as it races to catch up

Meta negotiations with moderators in Kenya over labor dispute collapse

Image Credits: TechCrunch

Meta, hell-bent on catching up to rivals in the generative AI space, is spending billions on its own AI efforts. A portion of those billions is going toward recruiting AI researchers. But an even larger chunk is being spent developing hardware, specifically chips to run and train Meta’s AI models.

Meta unveiled the newest fruit of its chip development efforts today, conspicuously a day after Intel announced its latest AI accelerator hardware. Called the “next-gen” Meta Training and Inference Accelerator (MTIA), the successor to last year’s MTIA v1, the chip runs models such as those for ranking and recommending display ads on Meta’s properties (e.g., Facebook).

Compared to MTIA v1, which was built on a 7nm process, the next-gen MTIA uses a 5nm process. (In chip manufacturing, the process node loosely tracks transistor density; smaller numbers generally mean denser, more power-efficient chips.) The next-gen MTIA is a physically larger design, packed with more processing cores than its predecessor. And while it consumes more power (90W versus 25W), it also boasts more internal memory (128MB versus 64MB) and runs at a higher average clock speed (1.35GHz, up from 800MHz).

Meta says the next-gen MTIA is currently live in 16 of its data center regions and delivering up to 3x overall better performance compared to MTIA v1. If that “3x” claim sounds a bit vague, you’re not wrong — we thought so too. But Meta would only volunteer that the figure came from testing the performance of “four key models” across both chips.
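One rough comparison derived from the cited numbers (not a figure Meta published): at a claimed 3x the performance but 90W versus 25W, performance-per-watt actually dips slightly:

```python
# Derived from the figures Meta cites: ~3x overall performance vs. MTIA v1,
# at 90W vs. 25W. Perf-per-watt is our rough calculation, not Meta's claim.
perf_ratio = 3.0    # next-gen MTIA vs. MTIA v1
power_v1_w = 25.0
power_v2_w = 90.0

perf_per_watt_ratio = perf_ratio / (power_v2_w / power_v1_w)
print(f"perf/W vs. MTIA v1: {perf_per_watt_ratio:.2f}x")  # perf/W vs. MTIA v1: 0.83x
```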

“Because we control the whole stack, we can achieve greater efficiency compared to commercially available GPUs,” Meta writes in a blog post shared with TechCrunch.

Meta’s hardware showcase — which comes a mere 24 hours after a press briefing on the company’s various ongoing generative AI initiatives — is unusual for several reasons.

One, Meta reveals in the blog post that it’s not using the next-gen MTIA for generative AI training workloads at the moment, although the company claims it has “several programs underway” exploring this. Two, Meta admits that the next-gen MTIA won’t replace GPUs for running or training models — but instead will complement them.

Reading between the lines, Meta is moving slowly — perhaps more slowly than it’d like.

Meta’s AI teams are almost certainly under pressure to cut costs. The company’s set to spend an estimated $18 billion by the end of 2024 on GPUs for training and running generative AI models, and — with training costs for cutting-edge generative models ranging in the tens of millions of dollars — in-house hardware presents an attractive alternative.

And while Meta’s hardware drags, rivals are pulling ahead, much to the consternation of Meta’s leadership, I’d suspect.

Google this week made TPU v5p, its fifth-generation custom chip for training AI models, generally available to Google Cloud customers, and revealed Axion, its first custom Arm-based CPU. Amazon has several custom AI chip families under its belt. And Microsoft last year jumped into the fray with the Azure Maia AI Accelerator and the Azure Cobalt 100 CPU.

In the blog post, Meta says it took fewer than nine months to “go from first silicon to production models” of the next-gen MTIA, which, to be fair, is a shorter window than is typical between generations of Google’s TPUs. But Meta has a lot of catching up to do if it hopes to achieve a measure of independence from third-party GPUs and match its stiff competition.