The iPhone 16 launches today without its most hyped feature: Apple Intelligence

iPhone 16 Pro

Image Credits: Apple

The iPhone 16 officially goes on sale Friday. But for its earliest adopters, it arrives with a fundamental compromise baked into the deal.

Put simply, this is not the iPhone 16 that buyers were promised. Apple CEO Tim Cook said it would be the “first iPhone built for Apple Intelligence.” But that “for” is key: The handsets won’t actually ship with the platform’s most hyped AI features out of the gate.

This feels like a turning point for Apple. When it comes to new features on phones, the company may not always be known for being the first to market or for jumping on gimmicks, but it is known for being the best. That’s not the case here. Apple was compelled to board the AI hype train and is thus taking a leap into the half-baked void.

Apple has now pitched its Apple Intelligence suite twice — first when announcing it at its Worldwide Developers Conference (WWDC) in June, and again during its September iPhone 16 launch.

iPhone 16 Pro Max review: A $1,200 glimpse at a more intelligent future

But in actuality, the company is far behind competitors like Google and Microsoft, as well as upstarts like OpenAI and Anthropic, when it comes to shipping features.

The company’s first set of AI tools, announced and released in developer betas, includes rewriting tools, summaries of articles and notifications, object erasing in photos, and audio transcription. Much of this functionality already exists in the market. Apple’s bet is that its decisions around privacy — your usage data is not shared with other users or with other tech companies, it promises — will be enough to attract buyers.

Strictly speaking, the gap between promise and product is not as dramatic as you might think — or at least that is how Apple would defend all this. The iPhone goes on sale September 20, and Apple has promised to start rolling out its AI features in October.

Yet only a handful of features will go live at that time, and only in U.S. English. (Recall that the company counts heavily on international markets, with North America accounting for just over half of all iPhone unit sales.)

And for the more complicated AI bells and whistles, we all still have to wait. The company plans to roll out features like visual search and Image Playground starting next month, while additional language support begins rolling out in December — first in localized variants of English. Other languages will arrive sometime in 2025.

The iPhone 16 isn’t strictly necessary for those who want the new AI features. The company has already confirmed that the iPhone 15 Pro and 15 Pro Max will also get access to the platform.

So if Apple Intelligence is going to be the game changer Apple promises, it’s fair to wonder whether the rollout gaps and delays will keep users from upgrading — or whether consumers will adopt more of a wait-and-see approach, which might also translate to lower sales.

However, as my colleague Sarah Perez pointed out, Apple’s AI features could become more useful once third-party developers are able to fully integrate them with their apps. That’s worth considering, if and when it happens, but that’s more of a conversation for the iPhone 17.

That might well be the point here. Apple is building for longer-term opportunities, and for the first time, it feels like it’s asking buyers to take that leap of faith with it.

Tokens are a big reason today's generative AI falls short

LLM word with icons as vector illustration. AI concept of Large Language Models

Image Credits: Getty Images

Generative AI models don’t process text the same way humans do. Understanding their “token”-based internal environments may help explain some of their strange behaviors — and stubborn limitations.

Most models, from small on-device ones like Gemma to OpenAI’s industry-leading GPT-4o, are built on an architecture known as the transformer. Due to the way transformers conjure up associations between text and other types of data, they can’t take in or output raw text — at least not without a massive amount of compute.

So, for reasons both pragmatic and technical, today’s transformer models work with text that’s been broken down into smaller, bite-sized pieces called tokens — a process known as tokenization.

Tokens can be words, like “fantastic.” Or they can be syllables, like “fan,” “tas” and “tic.” Depending on the tokenizer — the model that does the tokenizing — they might even be individual characters in words (e.g., “f,” “a,” “n,” “t,” “a,” “s,” “t,” “i,” “c”).
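To make this concrete, here is a minimal sketch using tiktoken, OpenAI’s open-source tokenizer library; the cl100k_base encoding is chosen only for illustration, and the exact splits you see will depend on which tokenizer you load.

```python
# A rough sketch of tokenization with tiktoken, OpenAI's open-source tokenizer
# library (pip install tiktoken). The splits below are illustrative — they
# depend entirely on which encoding you load.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["fantastic", "fantastically unhelpful"]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{text!r} -> {len(ids)} token(s): {pieces}")
```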

Using this method, transformers can take in more information (in the semantic sense) before they reach an upper limit known as the context window. But tokenization can also introduce biases.

Some tokens have odd spacing, which can derail a transformer. A tokenizer might encode “once upon a time” as “once,” “upon,” “a,” “time,” for example, while encoding “once upon a ” (which has a trailing whitespace) as “once,” “upon,” “a,” with the trailing space as its own token. Depending on how a model is prompted — with “once upon a” or with “once upon a ” (trailing space included) — the results may be completely different, because the model doesn’t understand (as a person would) that the meaning is the same.

Tokenizers treat case differently, too. “Hello” isn’t necessarily the same as “HELLO” to a model; “hello” is usually one token (depending on the tokenizer), while “HELLO” can be as many as three (“HE,” “LL,” and “O”). That’s why many transformers fail the capital letter test.
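Both quirks are easy to observe directly. A quick sketch, again using tiktoken’s cl100k_base encoding as an arbitrary example — the specific splits and token counts will differ from tokenizer to tokenizer:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Trailing whitespace and capitalization both change the token sequence,
# even though a human would read the strings as (nearly) the same.
for text in ["once upon a", "once upon a ", "hello", "HELLO"]:
    ids = enc.encode(text)
    print(f"{text!r} -> {len(ids)} token(s): {[enc.decode([i]) for i in ids]}")
```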

“It’s kind of hard to get around the question of what exactly a ‘word’ should be for a language model, and even if we got human experts to agree on a perfect token vocabulary, models would probably still find it useful to ‘chunk’ things even further,” Sheridan Feucht, a PhD student studying large language model interpretability at Northeastern University, told TechCrunch. “My guess would be that there’s no such thing as a perfect tokenizer due to this kind of fuzziness.”

This “fuzziness” creates even more problems in languages other than English.

Many tokenization methods assume that a space in a sentence denotes a new word. That’s because they were designed with English in mind. But not all languages use spaces to separate words. Chinese and Japanese don’t — nor do Korean, Thai or Khmer.

A 2023 Oxford study found that, because of differences in the way non-English languages are tokenized, it can take a transformer twice as long to complete a task phrased in a non-English language versus the same task phrased in English. The same study — and another — found that users of less “token-efficient” languages are likely to see worse model performance yet pay more for usage, given that many AI vendors charge per token.

Tokenizers often treat each character in logographic systems of writing — systems in which printed symbols represent words without relating to pronunciation, like Chinese — as a distinct token, leading to high token counts. Similarly, tokenizers processing agglutinative languages — languages where words are made up of small meaningful word elements called morphemes, such as Turkish — tend to turn each morpheme into a token, increasing overall token counts. (The equivalent word for “hello” in Thai, สวัสดี, is six tokens.)
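You can reproduce this kind of disparity with the same library; the exact counts depend on the encoding, so treat the gap between languages, not the specific numbers, as the takeaway.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# The same greeting costs a very different number of tokens across scripts.
greetings = {"English": "hello", "Thai": "สวัสดี", "Japanese": "こんにちは"}
for lang, word in greetings.items():
    print(f"{lang}: {word!r} -> {len(enc.encode(word))} token(s)")
```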

In 2023, Google DeepMind AI researcher Yennie Jun conducted an analysis comparing the tokenization of different languages and its downstream effects. Using a dataset of parallel texts translated into 52 languages, Jun showed that some languages needed up to 10 times more tokens to capture the same meaning as in English.

Beyond language inequities, tokenization might explain why today’s models are bad at math.

Rarely are digits tokenized consistently. Because they don’t really know what numbers are, tokenizers might treat “380” as one token, but represent “381” as a pair (“38” and “1”) — effectively destroying the relationships between digits and results in equations and formulas. The result is transformer confusion; a recent paper showed that models struggle to understand repetitive numerical patterns and context, particularly temporal data. (See: GPT-4 thinks 7,735 is greater than 7,926).
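A quick way to see the inconsistency for yourself (again with cl100k_base, chosen only as an example) is to tokenize neighboring numbers and look at how each one splits:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Nearby numbers may be split into different pieces, which helps explain why
# digit-level arithmetic is awkward for token-based models.
for number in ["380", "381", "7735", "7926"]:
    pieces = [enc.decode([i]) for i in enc.encode(number)]
    print(f"{number} -> {pieces}")
```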

That’s also the reason models aren’t great at solving anagram problems or reversing words.

So, tokenization clearly presents challenges for generative AI. Can they be solved?

Maybe.

Feucht points to “byte-level” state space models like MambaByte, which can ingest far more data than transformers without a performance penalty by doing away with tokenization entirely. MambaByte, which works directly with raw bytes representing text and other data, is competitive with some transformer models on language-analyzing tasks while better handling “noise” like words with swapped characters, spacing and capitalized characters.

Models like MambaByte are in the early research stages, however.

“It’s probably best to let models look at characters directly without imposing tokenization, but right now that’s just computationally infeasible for transformers,” Feucht said. “For transformer models in particular, computation scales quadratically with sequence length, and so we really want to use short text representations.”
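To illustrate the quadratic scaling Feucht mentions, here is a toy sketch (not any production attention implementation): naive self-attention compares every position with every other, so the score matrix — and the work to fill it — grows with the square of the sequence length.

```python
import numpy as np

# Toy illustration: naive self-attention builds an n x n score matrix,
# so doubling the sequence length quadruples the number of entries.
def attention_scores(n_tokens: int, d_model: int = 64) -> np.ndarray:
    rng = np.random.default_rng(0)
    q = rng.standard_normal((n_tokens, d_model))  # stand-in query vectors
    k = rng.standard_normal((n_tokens, d_model))  # stand-in key vectors
    return q @ k.T / np.sqrt(d_model)             # shape: (n_tokens, n_tokens)

for n in (512, 1024, 2048):
    print(n, attention_scores(n).shape)
```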

Barring a tokenization breakthrough, it seems new model architectures will be the key.

Tesla shareholders to vote today on $56B pay package

09 January 2023, Brandenburg, Schönefeld: New Model Y electric vehicles parked in the early morning at Terminal 5 of Berlin-Brandenburg Airport. Due to space constraints at Tesla’s new plant in Grünheide, several thousand new electric vehicles are being stored in the airport’s parking lots. Tesla says it currently employs more than 7,000 people at the Grünheide plant, a number soon expected to reach 12,000 at the Gigafactory Berlin-Brandenburg — and even more with the planned expansion. The current site covers around 300 hectares, with another 100 hectares now being added. Photo: Patrick Pleul/dpa (Photo by Patrick Pleul/picture alliance via Getty Images)

Image Credits: Patrick Pleul/picture alliance / Getty Images

Welcome back to TechCrunch Mobility — your central hub for news and insights on the future of transportation. Sign up here for free — just click TechCrunch Mobility!

Kirsten Korosec is still off on holiday, so you’ve got me once again! Today is the big Tesla shareholder vote. We’ll finally get to learn, among other things, whether investors think Elon Musk deserves his $56 billion pay package. And if they don’t, will Musk make good on his threats and leave Tesla so he can focus on xAI or SpaceX or just X?!

You’ve probably seen the breathless and thirsty pleas from Tesla board members, shareholders and Musk himself asking investors to vote in favor of the compensation package. Musk and his supporters have claimed that he is owed such astronomical wealth because he hit whatever agreed-upon targets they say he hit. But before the shareholder vote commences, we think it’s worth a second look at why Chancellor Kathaleen McCormick struck down the package in January.

Her dominant theme? Musk holds so much sway over Tesla and its board of directors that there was no substantial negotiation when the company hammered out this deal with him in 2017-2018.

We’ll be watching and reporting, so check back in for updates. 

— Rebecca Bellan

A little bird

blinky cat bird green
Image Credits: Bryce Durbin

A little bird came to senior reporter Sean O’Kane and told him that Ford was poaching employees at EV startup Canoo. That led O’Kane down the rabbit hole of LinkedIn, where he learned Ford has been quite busy scooping up talent from competitors and tech giants alike, including Rivian, Tesla, Formula 1 and Apple. The upshot? Ford is capitalizing on the recent chaos in the auto industry to build out its secretive low-cost EV team.

Got a tip for us? Email Kirsten Korosec at [email protected], Sean O’Kane at [email protected] or Rebecca Bellan at [email protected]. Or check out these instructions to learn how to contact us via encrypted messaging apps or SecureDrop.

Deals!

money the station
Image Credits: Bryce Durbin

General Motors is giving Cruise an $850 million lifeline as the robotaxi company slowly starts re-entering markets. The money is sort of a bridge to help Cruise fund its daily operations until it can “find the right long-term capital efficient strategy, including new partnerships and external funding.” 

While this does seem like a large capital outlay, particularly since GM has already invested over $8 billion in Cruise, that $850 million is hundreds of millions lower than the automaker likely would have spent on Cruise this year. GM told investors at the end of last year that it would slash spending on Cruise following a series of safety incidents that culminated in Cruise grounding its entire fleet in November 2023.

Other deals that got my attention …

Descartes, a logistics company, has acquired BoxTop Technologies, a provider of shipment management solutions for small- to midsized logistics service providers, for $13 million, the company told TechCrunch via email.

FLO, an EV charging network operator, has secured $136 million in long-term capital, mainly from a Series E round led by Export Development Canada. The funds will help FLO accelerate deployment of its charging network in the U.S. and Canada and advance the rollout of its new charging products, according to a statement from the startup. 

Fly E-Bike, a NYC-based e-bike retailer focused on last-mile food delivery, went public on the Nasdaq last week at an opening price of $4.78 per share. The stock has since dipped nearly 20%.

General Motors’ board approved a new share buyback plan to repurchase up to $6 billion of outstanding common stock to create more upside for investors, according to a press release. The automaker also increased its stock dividend 33%, from $0.09 to $0.12 per share, in the first quarter. 

Tern AI’s positioning system aims to provide a low-cost alternative to GPS. The startup just raised a $4.4 million seed round from Scout Ventures, Shadow Capital, Bravo Victor VC and Veteran Fund.

Belgian transport and logistics management startup Qargo has raised $14 million in a Series A led by Balderton Capital. Qargo’s goal is to help the logistics industry digitize to create efficiencies, reduce costs and bring down emissions. 

Notable reads and other tidbits

Autonomous vehicles

Aside from scoring funding from GM, Cruise also announced that it’s back in Houston with a small fleet of vehicles. They will have a human safety driver behind the wheel as Cruise slowly and painstakingly revalidates its technology. 

May Mobility will launch an autonomous shuttle service in Detroit for residents who have disabilities or are 65 and older. Starting June 20, May will deploy three AVs, two of which are wheelchair-accessible, to help participants get to healthcare facilities, shopping centers, work and social activities.

Waymo issued its second recall after one of its robotaxis in Phoenix drove into a telephone pole. Waymo updated the software on its more than 600 vehicles as they returned to the depot for regular maintenance and recharging.

Electric vehicles, charging & batteries

Aptera is the perfect example of how crowdfunding can go wrong. The startup raised over $100 million from retail investors who were reeled in by flashy social media campaigns. So far, Aptera has nothing to show for it. 

The European Commission has concluded that the Chinese EV industry benefits from unfair subsidization that makes it a threat to EU EV producers. So it’s slapping more tariffs on Chinese EVs: BYD would face a 17.4% duty, Geely 20% and SAIC a whopping 38.1%.

Well, we certainly saw this coming. Fisker has issued its first recall for its troubled Ocean SUV due to problems with the warning lights. The recall comes as Fisker is on the brink of bankruptcy and has cut its workforce to the bone. 

Rivian at first seemed like it was a little all over the place. But with the revamped versions of its first two consumer vehicles (which you can read more about below!), which are built more efficiently, and its decision to set aside plans to build a factory in Georgia, the EV startup’s path to survival is becoming clearer.

Ride-hail 

Revel launched an all-Tesla, all-employee ride-hail service in New York City in 2021. Now the company is ditching the employee model in favor of a gig worker model akin to competitors Lyft and Uber. In the process, Revel will have to lay off its 1,000 driver employees. 

This week’s wheels

rivian next-gen-r1s-r1t
Image Credits: Kirsten Korosec

What is “This week’s wheels”? It’s a chance to learn about the different transportation products we’re testing, whether it’s an electric or hybrid car, an e-bike or even a ride in an autonomous vehicle. 

Before Kirsten went on holiday, she gave Rivian’s revamped R1T and R1S a drive. Let’s see what she learned. 

I recently got behind the wheel of the next-generation Rivian R1T and R1S. For those who want a deep dive, you can read my story about why this refresh matters and what new parts and features customers can expect. 

For folks looking for the highlights, here are a few quick notes on some important changes. 

The ride quality, specifically the road noise and suspension, is much improved in the next-gen R1 line. Ride quality in the first-generation R1S has been a source of complaints, so this improvement couldn’t have come at a better time.

I also had a chance to test the advanced driver-assistance system. First-gen R1 vehicles also have ADAS, but this one is supported by lots of new sensors, compute and software. The automaker is calling it the “Rivian Autonomy Platform,” but I won’t, because I’m a stickler for what is and what is not autonomy. This is a hands-on system for now, and the driver is always expected to be in control.

That being said, there is a lot of capability here. The perception stack includes 11 cameras, five radars, compute that is 10 times more powerful than what’s in the first-gen vehicles and what the company describes as “AI prediction technology.” And this system comes standard. 

Rivian will also offer a premium version of the system, called Rivian Autonomy Platform+. I had a chance to test one feature in this premium version called “Lane Change on Command.” The feature, which is only available on divided highways, will move the vehicle into another lane if the driver hits the signal indicator. The feature worked, although I noticed that it took a moment for the vehicle to settle into the center of the lane. 

Rivian says more automated driving features will roll out on this premium system in the months to come. 

Correction: Waymo has clarified that its recall did not involve an over-the-air software update.