Samsung Unpacked 2024: What we expect and how to watch Wednesday's hardware event

Samsung Galaxy MWC 2024 banners

Image Credits: Brian Heater

Samsung Unpacked 2024 kicks off Wednesday at 6 a.m. PT/9 a.m. ET. Why so early for our West Coast pals? The Galaxy device showcase is happening in Paris this year, putting the local start time at 1 p.m. CET. Paris, as luck would have it, is also kicking off the Summer Olympics roughly a fortnight later — an event for which Samsung happens to be a massive sponsor.

If past is precedent — and it always is with this stuff — foldables will take center stage at the Samsung event. The company has adhered to a six-month flagship release cycle for several years now. Since Samsung retired the Note in 2022, that’s meant new Galaxy S devices in January and February and Galaxy Z Folds and Galaxy Z Flips over the summer. Along with the change in scenery, however, it’s looking like the Samsung event will be packed.

Other expected headliners include a lot of time spent on Galaxy AI, which already got a good bit of face time at the Galaxy S24 event. That Unpacked also offered a “one more thing” in the form of the Galaxy Ring, which broke cover a little over a month later at MWC in Barcelona. Expect some concrete info on the new wearable, along with other accessories like Galaxy Buds.

You can watch Galaxy Unpacked live here.

Samsung Galaxy Z Fold and Z Flip 6 updates

The Samsung Galaxy Z Fold and Galaxy Z Flip 6.
Image Credits: Brian Heater

Obviously both new foldables have already leaked, along with most of the devices discussed below. Samsung’s core competencies are manufacturing components, making great hardware and leaking all of the above early.

The company’s all-in commitment to the foldable form factor has changed a lot of minds over the last several years. Plenty of questions about reliability and consumer interest swirled around the first few generations, but it’s safe to say that Samsung has proved that foldables are, indeed, a viable category.

Both new models will be powered by the Snapdragon 8 Gen 3 processor Qualcomm unveiled in Hawaii last fall (I really need to talk to someone about my travel budget). The Galaxy S24 series was among the first devices to get the new system-on-a-chip, a list that also includes devices from OnePlus and Xiaomi, while Vivo introduced a foldable with the silicon back in March.

The Fold 6 is reportedly getting some key design tweaks, making it thinner and lighter than its predecessor, with an adjusted 22:9 aspect ratio. That would put it more in line with the very good foldable offerings from Google and OnePlus. Other rumors point to a brighter screen and a new aluminum shell, while the Galaxy Z Flip 6 is said to be getting an upgraded battery.

Galaxy AI

Image Credits: Brian Heater

As you’ve no doubt heard, international law now requires that all phone manufacturers spend half of every event talking about how cool AI is. Samsung devoted a good bit of its winter Unpacked to the subject, showcasing a number of new camera features and its Circle to Search partnership with Google. Google picked up the mantle at I/O in May, while at WWDC last month, we learned that “AI” apparently stands for “Apple Intelligence” now.

Samsung’s biggest challenge here is finding ways to set Galaxy AI apart from the myriad Gemini features that are coming to Android. Details on those plans are a little hazy beyond things like a new translation feature for WhatsApp.

Galaxy Ring release date and price

Image Credits: Samsung

Samsung threw us a fun curve ball at the last Unpacked event with the brief announcement of a new wearable. With the arrival of Galaxy Ring, the company steps into a category that has thus far been dominated by smaller names like Oura. When the product was unveiled, those companies offered comments along the lines of being happy that Samsung had effectively validated their space.

Oura CEO Tom Hale told TechCrunch at the time, in part, “New players entering the space is validation for the category and drives us to aim higher to serve our members and community.”

As for what that validation will look like, Samsung will finally deliver concrete details. The Ring should get a release date and price; rumors are suggesting a $300 ballpark. That would put the device in line with Oura, but the question remains how much of the cost Samsung might tie up in a premium health software subscription (also like Oura’s).

Other things we’ll be looking out for at Samsung Unpacked

Samsung Galaxy Watch
Image Credits: Brian Heater

Samsung’s earbud philosophy has always been the smaller, the better. For years now, the company has relied on a small, spherical design that sits flush against the ear. It does the job for the most part, but this limits control, as there isn’t much surface area to interact with. Naturally, the company is adding stems — and an oblong charging case.

If the leaks thus far are any indication, a lot of people are going to ask you if they’re AirPods the next time you wear them out. The copyright holder has issued takedowns for many of the images posted to Twitter/X, which is a good sign that the leakers are onto something here.

Said leakers have also posted images of what purports to be the Galaxy Watch 7. Samsung also confirmed the new wearable is coming soon with a pre-show announcement of a new BioActive sensor. The company notes:

The all-new BioActive Sensor is essential to bringing you better preventative health experiences on the next Galaxy Watch, with design improvements that enable even more precise health insights. Samsung engineers focused on three upgrades to the new sensor: enhancing the performance of light-receiving photodiodes, adding additional colors of light-emitting diodes (LEDs), and arranging them optimally across the sensor.

Between the upgraded Watch and the new Ring, it’s safe to assume the company is going to spend plenty of time talking up its health and fitness platform. Given Apple’s patent struggles of late, Samsung might as well strike while the iron is hot.

TechCrunch will bring you the news as it happens. Galaxy Watch this space.

NASA and Rocket Lab aim to prove we can go to Mars for 1/10 the price

Rocket Lab ESCAPADE

Image Credits: Rocket Lab

A pair of Rocket Lab-made spacecraft are about to embark on a two-step journey. The first step is the 55-hour, 2,500-mile stretch from California to the launch site at Cape Canaveral. The second step? Just 11 months and 230 million miles to Mars. 

The objective of the Escape and Plasma Acceleration and Dynamics Explorers (ESCAPADE) mission is to study the interaction between the solar wind and the Martian atmosphere. The University of California, Berkeley’s Space Sciences Laboratory (SSL) developed the scientific payloads for the mission, but the satellite bus — the actual platform that will travel through space and host those payloads in an orbit around Mars — is all Rocket Lab. The mission is currently set to launch no earlier than October on the first launch of Blue Origin’s New Glenn rocket, according to NASA.

While the company is best known for its Electron rocket, which is second only to SpaceX’s Falcon 9 in terms of launch numbers, the majority of its revenue actually comes from building and selling spacecraft and spacecraft components. With ESCAPADE, Rocket Lab is looking to show both the space agency and the world that it can produce extremely high-performance spacecraft that are capable of journeying throughout the solar system. 

The company has proved itself once before, when it built the satellite bus for NASA’s Cislunar Autonomous Positioning System Technology Operations and Navigation Experiment (CAPSTONE) mission to the moon in 2022. That spacecraft took a nearly five-month sojourn into deep space before entering lunar orbit. But getting to Mars takes significantly longer — and historically, it’s also been very, very expensive. Two recent missions that sent orbiters around the Red Planet, the Mars Reconnaissance Orbiter in 2005 and MAVEN in 2013, each cost NASA over a half billion dollars.

So in 2019, the space agency established the Small Innovative Missions for Planetary Exploration (SIMPLEx) program to fund small spacecraft missions into deep space. Like other NASA programs established in recent years, it’s also an effort on the part of the agency to embrace risk. Instead of spending $550 million on a mission into deep space, NASA set a goal to spend just one-tenth of that and gave each SIMPLEx mission a $55 million price cap, excluding launch. ESCAPADE is one of three missions the agency selected under the SIMPLEx program, and in all likelihood, the first that will actually launch. 

Those funds went to the principal investigator for the mission, SSL, which contracted Rocket Lab for the two satellite buses. Rocket Lab isn’t saying how much of that $55 million it received, but the lead systems engineer for ESCAPADE, Christophe Mandy, said the company was “two orders of magnitude cheaper than anything else.”

The spacecraft, named Blue and Gold, are based on Rocket Lab’s Explorer platform (which gained flight heritage during CAPSTONE), known for its high delta-v capabilities to support missions of this kind. One of the biggest challenges for the Rocket Lab engineers was designing a spacecraft that can get from Earth orbit all the way to Mars; for that reason, the ESCAPADE spacecraft are about 70% fuel by mass. That fuel will make the spacecraft capable of about 3 kilometers per second of delta-v, or change in velocity, which is very high for a satellite of this size.
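As a sanity check on those figures, the classical Tsiolkovsky rocket equation ties fuel fraction and delta-v together. Here’s a minimal sketch in Python; the specific impulse is an assumed, illustrative value for a small bipropellant system, not a published ESCAPADE spec:

```python
from math import log

# Figures from the article: ~70% fuel by mass, ~3 km/s of delta-v.
G0 = 9.80665          # standard gravity, m/s^2
ISP = 260.0           # assumed specific impulse in seconds (illustrative)
FUEL_FRACTION = 0.70  # fuel as a share of total (wet) mass

# Tsiolkovsky rocket equation: delta_v = Isp * g0 * ln(m_wet / m_dry)
mass_ratio = 1.0 / (1.0 - FUEL_FRACTION)  # m_wet / m_dry ~= 3.33
delta_v = ISP * G0 * log(mass_ratio)

print(f"mass ratio: {mass_ratio:.2f}")
print(f"delta-v:    {delta_v / 1000:.2f} km/s")  # ~3.07 km/s
```

With a plausible specific impulse, a 70% fuel fraction lands right around the 3 km/s figure Rocket Lab quotes, which is why the spacecraft have to be mostly propellant.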

The two ESCAPADE spacecraft side by side.
Image Credits: Rocket Lab

The other big challenge is that Rocket Lab didn’t know the launch provider until relatively late in the design process, when NASA selected New Glenn in February 2023. This unknown affected what are called the “driving constraints” for the spacecraft, or the factors that shape the engineers’ design decisions.

“Almost every single spacecraft I’ve ever seen has had launch vehicle as a driving constraint, but because we didn’t know what the launch vehicle was going to be, we did that differently,” Mandy said. “So we made an enormous amount of effort to make it so that the launch vehicle was not [a] driving constraint, which is just very unusual.” 

Instead, Rocket Lab engineers ended up basing much of the spacecraft design on another variable: the maximum amount of mass the spacecraft can take through a critical maneuver called Mars orbital insertion (MOI), the burn the spacecraft will perform in deep space to enter Martian orbit.

“So the amount of mass we have on the system is driven by physics, rather than by something man-made, like the launch vehicle,” Mandy said. But once the launch vehicle was selected, “we didn’t have to do the redesign, because our design was driven by other requirements.” 

These constraints helped push engineers to innovate. Instead of a box, the two spacecraft are basically “tank sandwiches,” as Mandy called them, with two decks connected by struts, with the fuel tanks in the middle. Typically, the primary structure of a satellite accounts for around 20-22% of its total mass; on ESCAPADE, thanks to the sandwich design, that number is just 12%.

These changes have escalating effects, Mandy said: Less mass in the primary structure means less fuel is needed to push it, which in turn means smaller tanks, and so on. Engineers also designed the spacecraft so that all the components that tend to get hot, like the flight computer and the radio, are near one deck of the spacecraft, while all the components that tend to get cold, like the propulsion system, are near the other. Taken together, the changes mean the spacecraft need less power, smaller solar panels and fewer heaters, among other savings.
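To see how that cascade compounds, here’s a toy calculation (every number is an illustrative assumption, not a Rocket Lab figure) that fixes the payload and the required delta-v, then lets the structure fraction vary:

```python
from math import exp

G0, ISP, DELTA_V = 9.80665, 260.0, 3000.0  # assumed: m/s^2, seconds, m/s
PAYLOAD = 90.0  # kg of payload and avionics; a hypothetical fixed dry mass

def wet_mass(structure_fraction: float) -> float:
    """Total (wet) mass needed if structure is this share of total mass."""
    ratio = exp(DELTA_V / (ISP * G0))  # required m_wet / m_dry, rocket equation
    # m_dry = PAYLOAD + structure_fraction * m_wet, so solving
    # m_wet = ratio * (PAYLOAD + structure_fraction * m_wet) for m_wet:
    return ratio * PAYLOAD / (1.0 - ratio * structure_fraction)

for f in (0.21, 0.12):  # a typical structure fraction vs. ESCAPADE's ~12%
    print(f"structure at {f:.0%} of total mass -> wet mass ~{wet_mass(f):.0f} kg")
```

In this toy model, cutting the primary structure from roughly 21% of total mass to 12% nearly halves the spacecraft that has to be launched, which is exactly the kind of compounding Mandy describes.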

After launch, the spacecraft will spend 11 months traveling to Mars before performing that critical MOI burn. But the sun will be between Earth and Mars when the spacecraft are expected to perform the burn, making timely communication with them impossible. Rocket Lab engineers will have to wait another three months or so before sending a command to the spacecraft to start circularizing their orbits. Then the spacecraft will collect and transmit scientific data back to Earth for around 11 months.

Mandy declined to say the exact launch window for the mission, saying that it’s up to Blue Origin to determine, but he did say that now is the peak of efficiency for the spacecraft’s travel, and that window extends “through several months after the peak.” If Blue Origin misses the window, the two companies and NASA will have to wait another 26 months until the ESCAPADE spacecraft can start unlocking the secrets of Mars.

Stephen Wolfram thinks we need philosophers working on big questions around AI

Stephen Wolfram speaking at the Collision Conference in New Orleans in May 2018.

Image Credits: Stephen McCarthy / Getty Images

Mathematician and scientist Stephen Wolfram grew up in a household where his mother was a philosophy professor at Oxford University. As such, his younger self didn’t want anything to do with the subject, but an older and perhaps wiser Wolfram sees value in thinking deeply about things. Now he wants to bring some of that deep philosophical rigor to AI research to help us better understand the issues we encounter as AI becomes more capable.

Wolfram was something of a child prodigy, publishing his first scientific paper at 15 and graduating from Caltech with a doctorate at 20. His impressive body of work crosses science, math and computing: He developed Mathematica, Wolfram Alpha and the Wolfram Language, a powerful computational programming language.

“My main life work, along with basic science, has been building our Wolfram language computational language for the purpose of having a way to express things computationally that’s useful to both humans and computers,” Wolfram told TechCrunch.

As AI developers and others start to think more deeply about how computers and people intersect, Wolfram says it is becoming much more of a philosophical exercise, involving thinking in the pure sense about the implications this kind of technology may have on humanity. That kind of complex thinking is linked to classical philosophy.

“The question is what do you think about, and that’s a different kind of question, and it’s a question that’s found more in traditional philosophy than it is in the traditional STEM,” he said.

For example, when you start talking about how to put guardrails on AI, these are essentially philosophical questions. “Sometimes in the tech industry, when people talk about how we should set up this or that thing with AI, some may say, ‘Well, let’s just get AI to do the right thing.’ And that leads to, ‘Well, what is the right thing?’” And determining moral choices is a philosophical exercise.

He says he has had “horrifying discussions” with companies that are putting AI out into the world, clearly without thinking about this. “The attempted Socratic discussion about how you think about these kinds of issues, you would be shocked at the extent to which people are not thinking clearly about these issues. Now, I don’t know how to resolve these issues. That’s the challenge, but it’s a place where these kinds of philosophical questions, I think, are of current importance.”

He says scientists in general have a hard time thinking about things in philosophical terms. “One thing I’ve noticed that’s really kind of striking is that when you talk to scientists, and you talk about big, new ideas, they find that kind of disorienting because in science, that is not typically what happens,” he said. “Science is an incremental field where you’re not expecting that you’re going to be confronted with a major different way of thinking about things.”

If the main work of philosophy is to answer big existential questions, he sees us coming into a golden age of philosophy due to the growing influence of AI and all of the questions it’s raising. In his view, a lot of the questions that AI now confronts us with are, at their core, traditional philosophical questions.

“I find that the groups of philosophers that I talk to are actually much more agile when they think paradigmatically about different kinds of things,” he said.

One such meeting on his journey was with a group of master’s philosophy students at Ralston College in Savannah, Georgia. Wolfram spoke to students there about the coming collision of liberal arts and philosophy with technology. In fact, Wolfram says he has reread Plato’s “Republic” because he wants to return to the roots of Western philosophy in his own thinking.

“And this question of ‘if the AIs run the world, how do we want them to do that? How do we think about that process? What’s the kind of modernization of political philosophy in the time of AI?’ These kinds of things, this goes right back to foundational questions that Plato talked about,” he told students.

Rumi Allbert, a student in the Ralston program who has spent his career working in data science, was fascinated by Wolfram’s thinking. Allbert also participated in the Wolfram Summer School, an annual program designed to help students understand Wolfram’s approach to applying science to business ideas.

“It’s very, very interesting that a guy like Dr. Wolfram has such an interest in philosophy, and I think that speaks to the volume of importance of philosophy and the humanistic approach to life. Because it seems to me, he has gotten so developed in his own field, [it has evolved] to more of a philosophical question,” Allbert said.

That Wolfram, who has been at the forefront of computer science for a half century, is seeing the connections between philosophy and technology could be a signal that it’s time to start addressing these questions around AI usage in a much broader way than purely as a math problem. And perhaps bringing philosophers into the discussion is a good way to achieve that.

Great, now we have to become digital copyright experts

Image Credits: Nigel Sussman

When news broke last year that AI heavyweight OpenAI and Axel Springer had reached a financial agreement and partnership, it seemed to bode well for harmony between folks who write words and tech companies that use them to help create and train artificial intelligence models. At the time, OpenAI had also come to an agreement with the AP, for reference.

Then, as the year ended, the New York Times sued OpenAI and its backer Microsoft, alleging that the AI company’s generative AI models were “built by copying and using millions of The Times’s copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides, and more.” Due to what the Times considers to be “unlawful use of [its] work to create artificial intelligence products,” OpenAI’s models “can generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style, as demonstrated by scores of examples.”


The Times added in its suit that it “objected after it discovered that Defendants were using Times content without permission to develop their models and tools,” and that “negotiations have not led to a resolution” with OpenAI.

How to respect copyright while ensuring that AI development doesn’t grind to a halt is a question that will not be answered quickly. But the agreements and more fractious disputes between creators and the AI companies that want to ingest and use that work to build artificial intelligence models create an unhappy moment for both sides of the conflict. Tech companies are busy baking new generative AI models trained on data that includes copyright-protected material into their software products; Microsoft is a leader in that particular work, it’s worth noting. And media companies that have spent massively over time to build up a corpus of reported and otherwise created materials are incensed that their efforts are being subsumed into machines that give nothing back to the folks who provided their training data.

You can easily see the argument from either perspective. Tech companies crawl the internet already and have a history of collecting and parsing information for the sake of helping individuals navigate that data. Search engines, in other words. So why is AI training data any different? Media folks on the other hand have seen their own industry decline in recent years — most especially in the realm of journalism, where the Times is a heavyweight — and are loath to see another generation of tech products that depend on their work collect huge revenues while the folks who did the original work receive comparatively little, or in the case of AI training, often nothing.

We don’t need to pick a side here, though I am sure that both you and I have our own views that we could debate. Instead, this morning let’s take a look at some of the critical arguments in play in the AI data-training debate that are shaping how folks consider the issue. It’s going to be a critical issue in 2024. This will be educational for us both, and I think fun as well. To work!

The Times’ argument

The lawsuit is here, and is worth reading in its entirety. Given its length, a complete summary is impossible, but I want to highlight a few key points that matter.

The Times states that creating high-quality journalism is very expensive. That’s true. The Times also argues that copyright is critical for the protection of its work, and the functioning of its business model. Again, true.

Continuing, the Times notes that it has a history of licensing its materials to others. You can use its journalism, in other words, but you have to pay for that right from its perspective. The publication separates those arrangements from how its agreements with search engines function, writing: “While The Times, like virtually all online publishers, permits search engines to access its content for the limited purpose of surfacing it in traditional search results, The Times has never given permission to any entity, including Defendants, to use its content for GenAI purposes.”

Clear enough so far, right? Sure, but if LLMs are trained on oceans of data then why does it matter where any particular scrap came from? Can the Times point out clearly that its material was used in such a manner that it is being leaned on heavily to build a commercial product that others are selling without paying it for its inputs to that work?

The paper certainly thinks so. In its suit the Times notes that the “training dataset for GPT-2 includes an internal corpus OpenAI built called ‘WebText,’ which includes ‘the text contents of 45 million links posted by users of the ‘Reddit’ social network.’” The Times is one of the leading sources used in that particular dataset. Why does that matter? Because OpenAI wrote that WebText was built to emphasize quality of material, per the suit. Put another way, OpenAI said that use of Times material in WebText and GPT-2 was to help make it better.

The Times then turns to WebText2, used in GPT-3, which was “weighted 22% in the training mix for GPT-3 despite constituting less than 4% of the total tokens in the training mix.” And in WebText2, “Times content—a total of 209,707 unique URLs—accounts for 1.23% of all sources listed in OpenWebText2, an open-source re-creation of the WebText2 dataset used in training GPT-3.”
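One way to read those two numbers is as an oversampling factor. Here’s a quick back-of-the-envelope version (the 22% and 4% figures come from the complaint; the oversampling framing is our own illustration, not a computation the suit makes):

```python
weight_in_mix = 0.22    # WebText2's weight in the GPT-3 training mix
share_of_tokens = 0.04  # WebText2's share of total tokens (an upper bound)

oversampling = weight_in_mix / share_of_tokens
print(f"WebText2 was sampled ~{oversampling:.1f}x more often than a "
      "uniform-by-token mix would imply")  # ~5.5x
```

In other words, by the suit’s numbers, each WebText2 token carried roughly five and a half times the influence of an average token in the mix.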

Again, the Times is highlighting that even OpenAI agrees that its work was important to the creation of some of its popular models.

And Times material is well-represented in the Common Crawl dataset, which the paper describes as the “most highly weighted dataset in GPT-3.” How much Times material is included in Common Crawl? “The domain www.nytimes.com is the most highly represented proprietary source (and the third overall behind only Wikipedia and a database of U.S. patent documents) represented in a filtered English-language subset of a 2019 snapshot of Common Crawl, accounting for 100 million tokens,” it wrote.
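The accounting behind a figure like that is conceptually simple: group a crawl’s documents by source domain and tally tokens. Here’s a toy sketch of the idea; the documents are stand-ins, and a real analysis would stream WARC records from an actual Common Crawl snapshot and use a proper tokenizer rather than whitespace splitting:

```python
from collections import Counter
from urllib.parse import urlparse

# Stand-in documents for illustration only.
documents = [
    ("https://www.nytimes.com/2019/01/some-article.html", "text of one article ..."),
    ("https://en.wikipedia.org/wiki/Some_Page", "text of one page ..."),
]

tokens_by_domain = Counter()
for url, text in documents:
    domain = urlparse(url).netloc                  # e.g. "www.nytimes.com"
    tokens_by_domain[domain] += len(text.split())  # crude whitespace token count

for domain, count in tokens_by_domain.most_common():
    print(f"{domain}: {count} tokens")
```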

The Times goes on to argue that similar uses of its material were likely made in later GPT models built by OpenAI. That usage of Times material, weighted more heavily thanks to its quality and never paid for, is what OpenAI will have to defend under fair use rules.

The Times’ argument, I think, boils down to “hey, you took our stuff to make your thing better, and now you are making tons of money off of it, and that means you should pay us for what you took, used, and are still using today.” (This riff doesn’t include the Times’ argument that certain products built on AI models trained on its data are also cannibalizing its revenue streams by competing with its own, original work; as that argument is downstream from the model creation point, I consider it subsidiary to the above.)

The tech perspective

There was a discussion held by the U.S. Copyright Office last April that included representatives from the venture capital and technology industries, as well as rights holders. You can read a transcript here, which I heartily recommend.

Well-known venture firm a16z took part, arguing that “the overwhelming majority of the time, the output of a generative AI service is not ‘substantially similar’ in the copyright sense to any particular copyrighted work that was used to train the model.”

In the same block of remarks, a16z added that “the data needed [for AI model creation] is so massive that even collective licensing really can’t work. What we’re talking about in the context of these large language models is training on a corpus that is essentially the entire volume of the written word.” As we saw from the Times’ arguments noted above, it’s true that LLMs ingest lots of stuff, but they do not give it all equal weight. How that will impact the venture argument remains to be seen.

In an October comment, again to the U.S. Copyright Office, the same venture firm argued that when “copies of copyrighted works are created for use in the development of a productive technology with non-infringing outputs, our copyright law has long endorsed and enabled those productive uses through the fair use doctrine,” without which search engines and online book search would not work. “[E]ach of these technologies involves the wholesale copying of one or many copyrighted works. The reason they do not infringe copyright is that this copying is in service of a non-exploitive purpose: to extract information from the works and put that information to use” to extend what the works could originally do.

To a16z, AI model training is the same: “For the very same reason, the use of copyrighted works en masse to train an AI model—by allowing it to isolate statistical patterns and non-expressive information from those works—does not infringe copyright either.” If the U.S. decides to impose “the cost of actual or potential copyright liability on the creators of AI models,” the firm argues, it will “either kill or significantly hamper their development.”

Of course, this is an investor talking its book. But in the realm of tech advancement, sometimes a VC talking their book and arguing in favor of rapid technological innovation are one and the same. Summarizing the tech argument, it goes something like “there’s precedent for ingesting lots of data, including copyright-protected data, into tech products without paying for it, and this is just that in a new suit.”

Another way to think about it

There’s an interesting question of scale afoot here. Tech thinker Benedict Evans, a former a16z partner (it’s worth noting), dug into the thorny issues above, adding the following bit of cud for us to chew:

[O]ne way to think about this might be that AI makes practical at a massive scale things that were previously only possible on a small scale. This might be the difference between the police carrying wanted pictures in their pockets and the police putting face recognition cameras on every street corner – a difference in scale can be a difference in principle. What outcomes do we want? What do we want the law to be? What can it be? The law can change.

The Times and the tech industry are arguing current law. Evans points out that the scale of data ingestion for AI model creation could create a scenario in which existing law might not fit what we want to have happen as a society. And the law can change, provided that the nation’s elected officials can, in fact, still pass laws.

Summing up: The Times argues, receipts in hand, that its data was used more than other data in training certain OpenAI models because it was good, and since that copyrighted material was used in particular, it should get paid. OpenAI and its backers and defenders are hoping that existing precedent and fair use protections are enough to keep their legal and financial liabilities low while they make lots of money with their new technologies. Finally, it’s also possible that we need new laws to handle situations like this, as the ones we have might not anticipate activity at this scale.

From where I sit, I don’t expect any OpenAI money to come to me for whatever it has ingested of my own writing. But I also don’t own most of it — my employers both current and historical do, and they have a lot more total material, and far greater legal resources to bring to bear along with the very same profit motive that the Times and OpenAI have. Perhaps I too will get dragged into this by proxy. That will make reporting on it all the more touchy. And hey, maybe that reporting itself will help future AI models explain to other people why they don’t have to pay for it.

This Week in AI: Can we (and could we ever) trust OpenAI?

pattern of openAI logo

Image Credits: Bryce Durbin / TechCrunch

Keeping up with an industry as fast-moving as AI is a tall order. So until an AI can do it for you, here’s a handy roundup of recent stories in the world of machine learning, along with notable research and experiments we didn’t cover on their own.

By the way, TechCrunch plans to launch an AI newsletter on June 5. Stay tuned. In the meantime, we’re upping the cadence of our semiregular AI column, which was previously twice a month (or so), to weekly — so be on the lookout for more editions.

This week in AI, OpenAI launched discounted plans for nonprofits and education customers and drew back the curtains on its most recent efforts to stop bad actors from abusing its AI tools. There’s not much to criticize there — at least not in this writer’s opinion. But I will say that the deluge of announcements seemed timed to counter the company’s recent bad press.

Let’s start with Scarlett Johansson. OpenAI removed one of the voices used by its AI-powered chatbot ChatGPT after users pointed out that it sounded eerily similar to Johansson’s. Johansson later released a statement saying that she hired legal counsel to inquire about the voice and get exact details about how it was developed — and that she’d refused repeated entreaties from OpenAI to license her voice for ChatGPT.

Now, a piece in The Washington Post implies that OpenAI didn’t in fact seek to clone Johansson’s voice and that any similarities were accidental. But why, then, did OpenAI CEO Sam Altman reach out to Johansson and urge her to reconsider two days before a splashy demo that featured the soundalike voice? It’s a tad suspect.

Then there’s OpenAI’s trust and safety issues.

As we reported earlier in the month, OpenAI’s since-dissolved Superalignment team, responsible for developing ways to govern and steer “superintelligent” AI systems, was promised 20% of the company’s compute resources — but only ever (and rarely) received a fraction of this. That (among other reasons) led to the resignation of the team’s two co-leads, Jan Leike and Ilya Sutskever, formerly OpenAI’s chief scientist.

Nearly a dozen safety experts have left OpenAI in the past year; several, including Leike, have publicly voiced concerns that the company is prioritizing commercial projects over safety and transparency efforts. In response to the criticism, OpenAI formed a new committee to oversee safety and security decisions related to the company’s projects and operations. But it staffed the committee with company insiders — including Altman — rather than outside observers. This comes as OpenAI reportedly considers ditching its nonprofit structure in favor of a traditional for-profit model.

Incidents like these make it harder to trust OpenAI, a company whose power and influence grows daily (see: its deals with news publishers). Few corporations, if any, are worthy of trust. But OpenAI’s market-disrupting technologies make the violations all the more troubling.

It doesn’t help matters that Altman himself isn’t exactly a beacon of truthfulness.

When news of OpenAI’s aggressive tactics toward former employees broke — tactics that entailed threatening employees with the loss of their vested equity, or the prevention of equity sales, if they didn’t sign restrictive nondisclosure agreements — Altman apologized and claimed he had no knowledge of the policies. But, according to Vox, Altman’s signature is on the incorporation documents that enacted the policies.

And if former OpenAI board member Helen Toner is to be believed — one of the ex-board members who attempted to remove Altman from his post late last year — Altman has withheld information, misrepresented things that were happening at OpenAI and in some cases outright lied to the board. Toner says that the board learned of the release of ChatGPT through Twitter, not from Altman; that Altman gave wrong information about OpenAI’s formal safety practices; and that Altman, displeased with an academic paper Toner co-authored that cast a critical light on OpenAI, tried to manipulate board members to push Toner off the board.

None of it bodes well.

Here are some other AI stories of note from the past few days:

Voice cloning made easy: A new report from the Center for Countering Digital Hate finds that AI-powered voice cloning services make faking a politician’s statement fairly trivial.

Google’s AI Overviews struggle: AI Overviews, the AI-generated search results that Google started rolling out more broadly earlier this month on Google Search, need some work. The company admits this — but claims that it’s iterating quickly. (We’ll see.)

Paul Graham on Altman: In a series of posts on X, Paul Graham, the co-founder of startup accelerator Y Combinator, brushed off claims that Altman was pressured to resign as president of Y Combinator in 2019 due to potential conflicts of interest. (Y Combinator has a small stake in OpenAI.)

xAI raises $6B: Elon Musk’s AI startup, xAI, has raised $6 billion in funding as Musk shores up capital to aggressively compete with rivals including OpenAI, Microsoft and Alphabet.

Perplexity’s new AI feature: With its new capability Perplexity Pages, AI startup Perplexity is aiming to help users make reports, articles or guides in a more visually appealing format, Ivan reports.

AI models’ favorite numbers: Devin writes about the numbers different AI models choose when they’re tasked with giving a random answer. As it turns out, they have favorites — a reflection of the data on which each was trained.

Mistral releases Codestral: Mistral, the French AI startup backed by Microsoft and valued at $6 billion, has released its first generative AI model for coding, dubbed Codestral. But it can’t be used commercially, thanks to Mistral’s quite restrictive license.

Chatbots and privacy: Natasha writes about the European Union’s ChatGPT taskforce, and how it offers a first look at detangling the AI chatbot’s privacy compliance.

ElevenLabs’ sound generator: Voice cloning startup ElevenLabs introduced a new tool, first announced in February, that lets users generate sound effects through prompts.

Interconnects for AI chips: Tech giants including Microsoft, Google and Intel — but not Arm, Nvidia or AWS — have formed an industry group, the UALink Promoter Group, to help develop next-gen AI chip components.

Great, now we have to become digital copyright experts

Image Credits: Nigel Sussman (opens in a new window)

When news broke last year that AI heavyweight OpenAI and Axel Springer had reached a financial agreement and partnership, it seemed to bode well for harmony between folks who write words, and tech companies that use them to help create and train artificial intelligence models. At the time OpenAI had also come to an agreement with the AP, for reference.

Then as the year ended the New York Times sued OpenAI and its backer Microsoft, alleging that the AI company’s generative AI models were “built by copying and using millions of The Times’s copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides, and more.” Due to what the Times considers to be “unlawful use of [its] work to create artificial intelligence products,” OpenAI’s “can generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style, as demonstrated by scores of examples.”


The Exchange explores startups, markets and money.

Read it every morning on TechCrunch+ or get The Exchange newsletter every Saturday.


The Times added in its suit that it “objected after it discovered that Defendants were using Times content without permission to develop their models and tools,” and that “negotiations have not led to a resolution” with OpenAI.

How to balance respect for copyright with the need to keep AI development from grinding to a halt is a question that won’t be answered quickly. But the agreements, and the more fractious disputes, between creators and the AI companies that want to ingest their work to build artificial intelligence models make for an unhappy moment on both sides of the conflict. Tech companies are busy baking generative AI models trained on data that includes copyright-protected material into their software products; Microsoft is a leader in that particular work, it’s worth noting. And media companies that have spent massively over time to build up a corpus of reported and otherwise created material are incensed that their efforts are being subsumed into machines that give nothing back to the folks who provided the training data.

You can easily see the argument from either perspective. Tech companies already crawl the internet and have a long history of collecting and parsing information to help individuals navigate that data. Search engines, in other words. So why is AI training data any different? Media folks, on the other hand, have seen their own industry decline in recent years — most especially in the realm of journalism, where the Times is a heavyweight — and are loath to see another generation of tech products that depend on their work collect huge revenues while the folks who did the original work receive comparatively little or, in the case of AI training, often nothing.

We don’t need to pick a side here, though I am sure that you and I each have views we could debate. Instead, this morning let’s take a look at some of the key arguments shaping how folks think about the AI training-data debate, which is going to be a defining issue in 2024. This will be educational for us both, and I think fun as well. To work!

The Times’ argument

The lawsuit is here, and it’s worth reading in its entirety. Given its length, a complete summary is impossible, but I want to highlight a few key points that matter.

The Times states that creating high-quality journalism is very expensive. That’s true. The Times also argues that copyright is critical to the protection of its work and the functioning of its business model. Again, true.

Continuing, the Times notes that it has a history of licensing its materials to others. You can use its journalism, in other words, but you have to pay for that right from its perspective. The publication separates those arrangements from how its agreements with search engines function, writing: “While The Times, like virtually all online publishers, permits search engines to access its content for the limited purpose of surfacing it in traditional search results, The Times has never given permission to any entity, including Defendants, to use its content for GenAI purposes.”

Clear enough so far, right? Sure, but if LLMs are trained on oceans of data, why does it matter where any particular scrap came from? Can the Times clearly show that its material was leaned on heavily to build a commercial product that others are now selling, without the paper being paid for its inputs to that work?

The paper certainly thinks so. In its suit, the Times notes that the “training dataset for GPT-2 includes an internal corpus OpenAI built called ‘WebText,’ which includes ‘the text contents of 45 million links posted by users of the ‘Reddit’ social network.’” The Times is one of the leading sources in that particular dataset. Why does that matter? Because OpenAI wrote that WebText was built to emphasize quality of material, per the suit. Put another way: OpenAI itself said that including Times material in WebText, and thus GPT-2, helped make the model better.

The Times then turns to WebText2, used in GPT-3, which was “weighted 22% in the training mix for GPT-3 despite constituting less than 4% of the total tokens in the training mix.” And in WebText2, “Times content—a total of 209,707 unique URLs—accounts for 1.23% of all sources listed in OpenWebText2, an open-source re-creation of the WebText2 dataset used in training GPT-3.”
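
If it helps to see what that kind of up-weighting means mechanically, here is a minimal Python sketch of a weighted dataset mix. The dataset names and every number other than the suit’s roughly-4%-of-tokens, 22%-of-mix figures are assumptions for illustration, not a description of OpenAI’s actual training pipeline.

```python
import random
from collections import Counter

# Hypothetical training mix: (dataset, share of total tokens, sampling weight).
# The webtext2 row loosely mirrors the suit's figures (~4% of tokens, 22% of
# the mix); the other rows and all names are made up for illustration.
MIX = [
    ("common_crawl", 0.82, 0.60),
    ("webtext2",     0.04, 0.22),
    ("books_other",  0.14, 0.18),
]

def sample_source(mix):
    """Pick which dataset the next training example is drawn from.

    Sampling by weight rather than by token share is what "up-weighting"
    means here: a small, high-quality corpus is seen far more often than
    its raw size alone would imply.
    """
    names, _shares, weights = zip(*mix)
    return random.choices(names, weights=weights, k=1)[0]

# Over many draws, webtext2 supplies ~22% of examples despite ~4% of tokens.
counts = Counter(sample_source(MIX) for _ in range(100_000))
print({name: round(n / 100_000, 3) for name, n in counts.items()})
```

Run enough draws and the small corpus shows up in roughly 22% of samples, which is the sense in which a dataset can punch far above its raw size.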

Again, the Times is highlighting that OpenAI itself treated the paper’s work as important to the creation of some of its most popular models.

Times material is also well-represented in the Common Crawl dataset, which the paper describes as the “most highly weighted dataset in GPT-3”. How much Times material is included? “The domain www.nytimes.com is the most highly represented proprietary source (and the third overall behind only Wikipedia and a database of U.S. patent documents) represented in a filtered English-language subset of a 2019 snapshot of Common Crawl, accounting for 100 million tokens,” the suit reads.
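
For flavor, here is a small Python sketch of the sort of domain-frequency tally that produces a ranking like the one above. The file name and format are assumed for illustration; this is not the actual Common Crawl tooling or the researchers’ code.

```python
from collections import Counter
from urllib.parse import urlparse

def top_domains(url_file: str, n: int = 10) -> list[tuple[str, int]]:
    """Count the n most common domains in a newline-delimited URL list.

    This mirrors the kind of analysis the suit cites: take a filtered
    snapshot's URL list and see which sources dominate it.
    """
    counts: Counter = Counter()
    with open(url_file) as fh:
        for line in fh:
            domain = urlparse(line.strip()).netloc
            if domain:
                counts[domain] += 1
    return counts.most_common(n)

# Hypothetical usage; the file name is assumed, not a real artifact:
# top_domains("cc_2019_en_urls.txt") might surface wikipedia.org, a patent
# database and www.nytimes.com near the top, per the suit's description.
```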

The Times goes on to argue that similar uses of its material were likely made in later GPT models built by OpenAI. Using Times material, and giving that material extra weight thanks to its quality, all without paying for it, is what OpenAI will have to defend under fair use rules.

The Times’ argument, I think, boils down to: “Hey, you took our stuff to make your thing better, and now you are making tons of money off of it, which means you should pay us for what you took, used and are still using today.” (This riff doesn’t include the Times’ argument that certain products built on models trained on its data are also cannibalizing its revenue streams by competing with its own, original work; as that argument is downstream from the model-creation point, I consider it subsidiary to the above.)

The tech perspective

Last April, the U.S. Copyright Office held a discussion that included representatives from the venture capital and technology industries, as well as rights holders. You can read the transcript here, which I heartily recommend.

Well-known venture firm a16z took part, arguing that “the overwhelming majority of the time, the output of a generative AI service is not ‘substantially similar’ in the copyright sense to any particular copyrighted work that was used to train the model.”

In the same block of remarks, a16z added that “the data needed [for AI model creation] is so massive that even collective licensing really can’t work. What we’re talking about in the context of these large language models is training on a corpus that is essentially the entire volume of the written word.” As we saw from the Times arguments noted above, it’s true that LLMs ingest lots of stuff, but they do not give it all equal weight. How that will affect the venture argument remains to be seen.

In an October comment, again to the U.S. Copyright Office, the same venture firm argued that when “copies of copyrighted works are created for use in the development of a productive technology with non-infringing outputs, our copyright law has long endorsed and enabled those productive uses through the fair use doctrine,” without which search engines and online book search would not work. “[E]ach of these technologies involves the wholesale copying of one or many copyrighted works. The reason they do not infringe copyright is that this copying is in service of a non-exploitive purpose: to extract information from the works and put that information to use,” extending what the original material could do.

To a16z, AI model training is the same: “For the very same reason, the use of copyrighted works en masse to train an AI model—by allowing it to isolate statistical patterns and non-expressive information from those works—does not infringe copyright either.” And if the U.S. decides to impose “the cost of actual or potential copyright liability on the creators of AI models,” the firm argued, it will “either kill or significantly hamper their development.”

Of course, this is an investor talking its book. But in the realm of tech advancement, sometimes a VC talking their book and arguing in favor of rapid technological innovation are one and the same. Summarized, the tech argument goes something like: “There’s precedent for ingesting lots of data, including copyright-protected data, into tech products without paying for it, and this is just that in a new suit.”

Another way to think about it

There’s an interesting question of scale afoot here. Tech thinker Benedict Evans, a former a16z partner, it’s worth noting, dug into the thorny issues above, adding the following bit of cud for us to chew:

[O]ne way to think about this might be that AI makes practical at a massive scale things that were previously only possible on a small scale. This might be the difference between the police carrying wanted pictures in their pockets and the police putting face recognition cameras on every street corner – a difference in scale can be a difference in principle. What outcomes do we want? What do we want the law to be? What can it be? The law can change.

The Times and the tech industry are arguing current law. Evans points out that the scale of data ingestion required for AI model creation could create a scenario in which existing law doesn’t fit what we want to have happen as a society. And the law can change — provided that the nation’s elected officials can, in fact, still pass laws.

Summing up: The Times argues, receipts in hand, that its data was used more heavily than other data in training certain OpenAI models because it was good, and that since the material is copyrighted and was singled out for use, it should get paid. OpenAI and its backers and defenders are hoping that existing precedent and fair use protections are enough to keep their legal and financial liabilities low while they make lots of money with their new technologies. Finally, it’s also possible that we need new laws to handle situations like this, as what we have on the books may not have contemplated the scale of what’s going on.

From where I sit, I don’t expect any OpenAI money to come to me for whatever it has ingested of my own writing. But I also don’t own most of it — my employers, both current and past, do, and they have a lot more total material, and far greater legal resources to bring to bear, along with the very same profit motive that the Times and OpenAI have. Perhaps I too will get dragged into this by proxy. That will make reporting on it all the more touchy. And hey, maybe that reporting itself will help future AI models explain to other people why they don’t have to pay for it.

Women in AI: Sarah Myers West says we should ask, 'Why build AI at all?'

Sarah Myers West

Image Credits: TechCrunch

To give AI-focused women academics and others their well-deserved — and overdue — time in the spotlight, TechCrunch has been publishing a series of interviews focused on remarkable women who’ve contributed to the AI revolution. We’re publishing these pieces throughout the year as the AI boom continues, highlighting key work that often goes unrecognized. Read more profiles here.

Sarah Myers West is managing director at the AI Now Institute, an American research institute studying the social implications of AI and conducting policy research that addresses the concentration of power in the tech industry. She previously served as senior adviser on AI at the U.S. Federal Trade Commission and is a visiting research scientist at Northeastern University, as well as a research contributor at Cornell’s Citizens and Technology Lab.

Briefly, how did you get your start in AI? What attracted you to the field?

I’ve spent the last 15 years interrogating the role of tech companies as powerful political actors as they emerged on the front lines of international governance. Early in my career, I had a front-row seat observing how U.S. tech companies showed up around the world in ways that changed the political landscape — in Southeast Asia, China, the Middle East and elsewhere — and wrote a book delving into how industry lobbying and regulation shaped the origins of the surveillance business model for the internet, despite technologies that offered alternatives in theory that failed to materialize in practice.

At many points in my career, I’ve wondered, “Why are we getting locked into this very dystopian vision of the future?” The answer has little to do with the tech itself and a lot to do with public policy and commercialization.

That’s pretty much been my project ever since, both in my research career and now in my policy work as co-director of AI Now. If AI is a part of the infrastructure of our daily lives, we need to critically examine the institutions that are producing it, and make sure that as a society there’s sufficient friction — whether through regulation or through organizing — to ensure that it’s the public’s needs that are served at the end of the day, not those of tech companies.

What work are you most proud of in the AI field?

I’m really proud of the work we did while at the FTC, which is the U.S. government agency that among other things is at the front lines of regulatory enforcement of artificial intelligence. I loved rolling up my sleeves and working on cases. I was able to use my methods training as a researcher to engage in investigative work, since the toolkit is essentially the same. It was gratifying to get to use those tools to hold power directly to account, and to see this work have an immediate impact on the public, whether that’s addressing how AI is used to devalue workers and drive up prices or combatting the anti-competitive behavior of big tech companies.

We were able to bring on board a fantastic team of technologists working under the White House Office of Science and Technology Policy, and it’s been exciting to see the groundwork we laid there have immediate relevance with the emergence of generative AI and the importance of cloud infrastructure.

What are some of the most pressing issues facing AI as it evolves?

First and foremost is that AI technologies are widely in use in highly sensitive contexts — in hospitals, in schools, at borders and so on — but remain inadequately tested and validated. This is error-prone technology, and we know from independent research that those errors are not distributed equally; they disproportionately harm communities that have long borne the brunt of discrimination. We should be setting a much, much higher bar. But as concerning to me is how powerful institutions are using AI — whether it works or not — to justify their actions, from the use of weaponry against civilians in Gaza to the disenfranchisement of workers. This is a problem not in the tech, but of discourse: how we orient our culture around tech and the idea that if AI’s involved, certain choices or behaviors are rendered more ‘objective’ or somehow get a pass.

What is the best way to responsibly build AI?

We need to always start from the question: Why build AI at all? What necessitates the use of artificial intelligence, and is AI technology fit for that purpose? Sometimes the answer is to build better, and in that case developers should be ensuring compliance with the law, robustly documenting and validating their systems and making open and transparent what they can, so that independent researchers can do the same. But other times the answer is not to build at all: We don’t need more ‘responsibly built’ weapons or surveillance technology. The end use matters to this question, and it’s where we need to start.