Observe, the data observability platform, raises $115M with Snowflake investing

Series of screens on blue background

Image Credits: Andriy Onufriyenko / Getty Images

Enterprises today store and use data across an ever-growing number of applications and locations, making it challenging — if not impossible — to manage and query that data in a holistic way. That spells opportunity for startups building tools to stitch together that fragmentation, and today one of them — Observe — is announcing $115 million in funding on the heels of strong demand for its tech. The Series B values the startup at between $400 million and $500 million, sources tell TechCrunch. (Observe would not comment on the figure.)

Observe — not to be confused with Observe.AI — builds observability tools for machine-generated data that aim to break down data silos, helping developers understand how apps are working, being used, and potentially failing.

It was built from the ground up to integrate tightly with the data-as-a-service giant Snowflake. Now that strategic partner is becoming a strategic investor: Snowflake has joined the round alongside Series B lead Sutter Hill Ventures and previous backers Capital One Ventures and Madrona.

The round is all equity, but part of it includes a conversion of previous debt that the company had raised (we covered a $50 million debt raise in October 2023). CEO Jeremy Burton said in an interview that the plan is to convert the remaining debt in an upcoming Series C.

Some further context on the startup’s valuation: while Observe would not comment on the figure specifically for this story, it did note that it is “10x higher than the company’s Series A round four years ago.” PitchBook data estimates that valuation was just under $35 million, which would put this round closer to $350 million (our sources, close to the company, say it’s in the range we list above).

What’s also worth pointing out is that the company has now raised some $205 million across different funding events. With a valuation that’s now potentially around $400 million, it means that investors own a large chunk of this company.

The major pressure on valuations, in fact, is part of the reason why Observe raised debt last year instead of an equity round: it meant no valuation haircuts.

“Two years ago, our valuation would have been $3 billion,” Burton said, a little wistfully. “Valuations have contracted hugely.”

This latest round speaks to a few significant currents in the market at the moment.

First, enterprises are very much under pressure to look for more cost-effective solutions for running their technology.

The promise of saving money is what drove enterprises to ditch on-premises software and head to the cloud en masse over the last decade, only to find themselves bedeviled by high administration and surprise usage costs. Now the growth of platforms like Snowflake and Databricks is bringing those same complications to data storage. Observe’s argument is that usage-based pricing can still work out to be a better way to control costs compared to an observability service that bases pricing primarily on ingested data.

Ingesting silos of semi-structured data into a unified “lake,” as Observe does, helps cut down the time and effort — and thus cost — needed to query that data. The company charges primarily around queries rather than data ingestion, meaning companies pay for what they use.
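
To make that concrete, here is a deliberately toy sketch of the pattern being described: land semi-structured events from different silos in one place, and impose structure only when a question is asked. It is a generic illustration, not Observe’s implementation, and the event sources and fields are made up.

```python
import json
from collections import Counter

# Toy illustration of the general pattern (not Observe's actual system):
# semi-structured events from different silos are landed in one place with
# minimal up-front transformation, and structure is imposed at query time.

raw_events = [
    '{"source": "app", "level": "error", "msg": "timeout calling payments"}',
    '{"source": "lb",  "status": 502, "path": "/checkout"}',
    '{"source": "app", "level": "info", "msg": "user signed in"}',
]

# "Ingest": keep every event as-is in a single lake-like collection.
lake = [json.loads(e) for e in raw_events]

# "Query": pay the cost of interpretation only when a question is asked.
errors_by_source = Counter(
    e["source"] for e in lake
    if e.get("level") == "error" or e.get("status", 0) >= 500
)
print(errors_by_source)  # Counter({'app': 1, 'lb': 1})
```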

Second, enterprises are looking to get more mileage out of their data. The main use case for Observe today is analyzing data to troubleshoot when an application is not working as it should. Last year the company launched a generative AI tool that nudges users on what they can query and what to look at next. That is inevitably leading customers to use the tool for more than just troubleshooting, in areas like marketing and security.

“You can also ingest security-related data or customer experience-related data,” Burton said. “In fact, we don’t care what the data is. It’s very permissive.” The company today works with third parties to enhance that work, but Burton doesn’t rule out native applications in these and other areas down the line.

As Snowflake continues to grow, it’s interesting that it’s choosing to invest in a partner building on its platform, rather than make a move into building (or acquiring) data observability tools to offer customers directly.

Stefan Williams, Snowflake’s VP of corporate development who runs Snowflake Ventures, said in an interview with TechCrunch that for now the company is seeing plenty of growth in its core database business. That makes a business like Observe, alongside others in the same space, more attractive as a way to generate more activity on that front. In other words, Snowflake doesn’t want to compete against or cannibalize the third parties that are driving more business and revenue to its platform overall.

“We see it as a lever to unlock new customers,” he said of the investment thesis of Snowflake Ventures. But in any event, investing in Observe becomes a tacit endorsement of it against other competitors in the space, which range from giants like Splunk through to other startups like Acceldata. “There is software and data observability. [In data,] there is nothing that competes with Observe right now,” Williams added.

The startup is not disclosing revenues but says that annual recurring revenue (ARR) is up 171% and net revenue retention is up 174% compared to a year ago.

Cyvl.ai is bringing data-driven solutions to transportation infrastructure

Big orange 'Road Work Ahead' sign

Image Credits: Catherine McQueen / Getty Images

In the summer after his freshman year at Worcester Polytechnic Institute, an engineering school in Worcester, Massachusetts, Cyvl.ai co-founder and CEO Daniel Pelaez needed a job. He went home and worked at his local public works department, where he noted that there was very little software for tracking road repairs. He was told to go out, drive around, find issues and fix them.

“I was filling in potholes, fixing signs and cutting down trees. And during my time there, I quickly saw firsthand they had no data on anything,” Pelaez told TechCrunch. He saw an opportunity that would eventually become Cyvl.ai, a firm that helps municipalities and civil engineering firms bring a digital layer to tracking the conditions of transportation infrastructure.

Today the Boston-area startup announced a $6 million investment.

“Our core vision and why we started the company in the first place is to help the entire world build and maintain better transportation infrastructure,” he said. This covers roads, highways, sidewalks, airports and rail. Anyone from Boston certainly knows this is an area where the city could use a lot of help.

The startup uses sensors to create a digital twin of a piece of infrastructure, such as a road, then shows where there are weaknesses and predicts when a repair is likely to be needed. It does this using lidar, cameras and sensors, combined with its own data analytics and geospatial AI pipeline, he said.

“What we’re providing our end users, whether it’s civil engineering firms or governments, is better data on their transportation systems than they could ever have captured before and just helping them really be data driven when it comes to building and maintaining these very large-scale transportation systems,” Pelaez said.

He admits that selling to governments is not for the faint of heart, but the startup has figured out a way around the issues involved in dealing with municipalities. It learned that external civil engineering firms are often responsible for doing road surveys (or other transportation reviews) on behalf of a city or town, and it has begun partnering with those firms in a channel-style relationship.

“Oftentimes, we’re really just relying on them to communicate to the government all the benefits of this technology, showing them that they were collecting it manually before, and we’re going to use this new technology to give them better data and better visuals at the same cost, if not cheaper than what was already proposed in the contract,” he said.

The approach seems to be working: close to 200 cities and towns use its software after just 2.5 years of operation, generating close to $2 million in annual recurring revenue (ARR). So the partnerships with these firms appear to be paying dividends. He says so far the chief competition has not been other companies doing something similar, but resistance to changing from manual processes to digital ones.

The company has an office in Somerville, Massachusetts, just outside of Boston, and currently has 11 employees, but they are hiring and he hopes to have 20 by the end of this year. He says as the son of an immigrant who came to the U.S. from Colombia with nothing, and as someone who was able to work his way through college, he is particularly cognizant of the need to build a diverse group of employees, and of the value of hard work.

The $6 million investment was led by Companyon Ventures with participation from Argon Ventures, AeroX Ventures and Alumni Ventures. Existing investors MassVentures, Launch Capital and RiverPark Ventures also participated in the round. The company has raised a total of $10 million.

PVML combines an AI-centric data access and analysis platform with differential privacy

PVML Team Photo

Image Credits: PVML

Enterprises are hoarding more data than ever to fuel their AI ambitions, but at the same time, they are also worried about who can access this data, which is often of a very private nature. PVML is offering an interesting solution by combining a ChatGPT-like tool for analyzing data with the safety guarantees of differential privacy. Using retrieval-augmented generation (RAG), PVML can access a corporation’s data without moving it, taking away another security consideration.

The Tel Aviv-based company recently announced that it has raised an $8 million seed round led by NFX, with participation from FJ Labs and Gefen Capital.

Image Credits: PVML

The company was founded by husband-and-wife team Shachar Schnapp (CEO) and Rina Galperin (CTO). Schnapp got his doctorate in computer science, specializing in differential privacy, and then worked on computer vision at General Motors, while Galperin got her master’s in computer science with a focus on AI and natural language processing and worked on machine learning projects at Microsoft.

“A lot of our experience in this domain came from our work in big corporates and large companies where we saw that things are not as efficient as we were hoping for as naïve students, perhaps,” Galperin said. “The main value that we want to bring organizations as PVML is democratizing data. This can only happen if you, on one hand, protect this very sensitive data, but, on the other hand, allow easy access to it, which today is synonymous with AI. Everybody wants to analyze data using free text. It’s much easier, faster and more efficient — and our secret sauce, differential privacy, enables this integration very easily.”

Differential privacy is far from a new concept. The core idea is to ensure the privacy of individual users in large datasets and provide mathematical guarantees for that. One of the most common ways to achieve this is to introduce a degree of randomness into the dataset, but in a way that doesn’t alter the data analysis.
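
For illustration, one classic building block is the Laplace mechanism, which adds calibrated random noise to the answer of a query so that the result barely depends on any single person’s record. The following is a minimal, textbook-style sketch, not PVML’s algorithm, and the salary figures are invented.

```python
import numpy as np

# Minimal sketch of the Laplace mechanism, a classic differential-privacy
# technique, applied to a simple count query. Textbook illustration only.

def private_count(values, predicate, epsilon=1.0):
    """Return a noisy count. The noise scale is calibrated to the
    sensitivity of a counting query (1) and the privacy budget epsilon."""
    true_count = sum(1 for v in values if predicate(v))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

salaries = [52_000, 61_000, 75_000, 120_000, 98_000]  # invented data
# How many people earn over $70k? The analyst sees a perturbed answer;
# no single individual's presence or absence changes it much.
print(private_count(salaries, lambda s: s > 70_000, epsilon=0.5))
```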

The team argues that today’s data access solutions are ineffective and create a lot of overhead. Often, for example, a lot of data has to be removed in the process of enabling employees to gain secure access to data — but that can be counterproductive because you may not be able to effectively use the redacted data for some tasks (plus the additional lead time to access the data means real-time use cases are often impossible).

Image Credits: PVML

The promise of using differential privacy means that PVML’s users don’t have to make changes to the original data. This avoids almost all of the overhead and unlocks this information safely for AI use cases.

Virtually all the large tech companies now use differential privacy in one form or another, and make their tools and libraries available to developers. The PVML team argues that it hasn’t really been put into practice yet by most of the data community.

“The current knowledge about differential privacy is more theoretical than practical,” Schnapp said. “We decided to take it from theory to practice. And that’s exactly what we’ve done: We develop practical algorithms that work best on data in real-life scenarios.”

None of the differential privacy work would matter if PVML’s actual data analysis tools and platform weren’t useful. The most obvious use case here is the ability to chat with your data, all with the guarantee that no sensitive data can leak into the chat. Using RAG, PVML can bring hallucinations down to almost zero and the overhead is minimal since the data stays in place.
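
For readers unfamiliar with the pattern, the sketch below shows the bare bones of RAG: retrieve the handful of records relevant to a question, then hand only those to the model as context, so the rest of the data never leaves its home. The keyword-overlap retriever and the `call_llm` stub are hypothetical placeholders for illustration; this is not PVML’s pipeline.

```python
# Bare-bones retrieval-augmented generation (RAG) sketch. The retriever is a
# toy keyword scorer and call_llm is a hypothetical stand-in for any chat
# model; only the retrieved snippets ever enter the prompt.

def retrieve(question, documents, k=2):
    """Rank documents by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def call_llm(prompt):
    # Placeholder: in practice this would call whatever model the platform uses.
    return f"[model answer grounded in a prompt of {len(prompt)} chars]"

docs = [
    "Q1 churn rose 4% in the enterprise segment.",
    "The Berlin office signed 12 new customers in March.",
    "Support ticket volume fell after the 2.3 release.",
]

question = "What happened to churn in the enterprise segment?"
context = "\n".join(retrieve(question, docs))
answer = call_llm(f"Answer using only this context:\n{context}\n\nQ: {question}")
print(answer)
```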

But there are other use cases, too. Schnapp and Galperin noted that differential privacy also allows companies to share data between business units, and it may even let some companies monetize access to their data for third parties.

“In the stock market today, 70% of transactions are made by AI,” said Gigi Levy-Weiss, NFX general partner and co-founder. “That’s a taste of things to come, and organizations who adopt AI today will be a step ahead tomorrow. But companies are afraid to connect their data to AI, because they fear the exposure — and for good reasons. PVML’s unique technology creates an invisible layer of protection and democratizes access to data, enabling monetization use cases today and paving the way for tomorrow.”

Change Healthcare stolen patient data leaked by ransomware gang

The United HealthCare Group Inc. logo on a laptop computer

Image Credits: Tiffany Hagler-Geard / Bloomberg / Getty Images

An extortion group has published a portion of what it says are the private and sensitive patient records on millions of Americans stolen during the ransomware attack on Change Healthcare in February.

On Monday, a new ransomware and extortion gang that calls itself RansomHub published several files on its dark web leak site containing personal information about patients across different documents, including billing files, insurance records and medical information.

Some of the files, which TechCrunch has seen, also contain contracts and agreements between Change Healthcare and its partners.

RansomHub threatened to sell the data to the highest bidder unless Change Healthcare pays a ransom.

It’s the first time that cybercriminals have published evidence that they possess medical and patient records from the cyberattack.

For Change Healthcare, there’s another complication: This is the second group to demand a ransom payment to prevent the release of stolen patient data in as many months.

UnitedHealth Group, the parent company of Change Healthcare, said there was no evidence of a new cyber incident. “We are working with law enforcement and outside experts to investigate claims posted online to understand the extent of potentially impacted data. Our investigation remains active and ongoing,” said Tyler Mason, a spokesperson for UnitedHealth Group.

What’s more likely is that a dispute between members and affiliates of the ransomware gang left the stolen data in limbo and Change Healthcare exposed to further extortion.

A Russia-based ransomware gang called ALPHV took credit for the Change Healthcare data theft. Then, in early March, ALPHV suddenly disappeared along with a $22 million ransom payment that Change Healthcare allegedly paid to prevent the public release of patient data.

An ALPHV affiliate — essentially a contractor who earns a commission on the cyberattacks they launch using the gang’s malware — went public claiming to have carried out the data theft at Change Healthcare, but that the main ALPHV/BlackCat crew stiffed them out of their portion of the ransom payment and vanished with the lot. The contractor said the millions of patients’ data was “still with us.”

Now, RansomHub says “we have the data and not ALPHV.” Wired, which first reported the second group’s extortion effort on Friday, cited RansomHub as saying it was associated with the affiliate that still had the data.

UnitedHealth previously declined to say whether it paid the hackers’ ransom, or how much data was stolen in the cyberattack.

The healthcare giant said in a statement on March 27 that it obtained a dataset “safe for us to access and analyze,” which the company received in exchange for the ransom payment, TechCrunch learned from a source with knowledge of the ongoing incident. UHG said it was “prioritizing the review of data that we believe would likely have health information, personally identifiable information, claims and eligibility or financial information.”

For Dataplor’s data intelligence tool, it’s all about location, location, location

GPS tracking map with navigation pins marking multiple destinations

Image Credits: Vadym Ivanchenko / Getty Images

If you want to get your product in a grocery store in Mexico City, Dataplor has global location intelligence to help you do that.

Founder and CEO Geoffrey Michener started the company in 2016 to index micro-businesses in emerging markets. The company raised $2 million in 2019 to bring Latin American food delivery vendors online.

Dataplor uses artificial intelligence, machine learning, large language models and a purpose-built technology platform to take in public domain data.

While that is not totally unique — there are companies like ThoughtSpot, Esri and Near doing something similar around business and location intelligence — Dataplor’s “secret sauce” is combining all of that technology and public domain data with a human factor. The company recruits and trains over 100,000 human validators, called Explorers, to validate all the data via computer. In addition, no personally identifiable information is used.

The result is answers to questions like “How many Taco Bell locations were opened across South America last year?” or “What percentage of Walmarts in Europe are located near a fast food restaurant?”

The company has since amassed more than 300 million point of interest (POI) records on over 15,000 brands — data like physical location, hours, contact information, whether they accept credit cards and consumer sentiment — in over 200 countries and territories.

Dataplor then licenses that data to companies in a wide variety of industries, including third-party logistics, real estate and finance; customers include American Express, Zettle and PayPal. More than 35 Fortune 500 brands already use Dataplor.

Dataplor’s location intelligence tool showing close rates. Image Credits: Dataplor

“Company 10-Ks are always six months late, so it’s hard to know if a company, for example, Starbucks, what their open or close rates are,” Michener told TechCrunch. “Other companies also want to know if one of their competitors closed or what the other businesses around there [are] so they can see if they can put a location there. We are trying to empower their decision-making.”

The company has also grown revenue by an average of 2.5x year-over-year since 2020 and is on track for profitability this year, Michener said.

Now the company wants to grow even faster, so Dataplor raised $10.6 million in Series A funding led by Spark Capital. Spark is known for early investments in Slack, Affirm, Postmates, Discord and Deel. The round also includes participation from Quest Venture Partners, Acronym Venture Capital, Circadian Ventures, Two Lanterns Venture Partners and APA Venture Partners. In total, the company has raised $20.3 million.

Dataplor intends to use the funding to make strategic hires and accelerate its sales and brand presence, Michener said.

For the Series A, Spark and Alex Finkelstein, the general partner who led the deal, “had a lot of conviction into what Dataplor was doing,” which was why Michener chose them to lead, he said. As part of the investment, Finkelstein joins Dataplor’s board of directors, which includes John Frankel, founding partner of ffVC.

“Alex saw the bigger picture, and he saw that while we’re not just a POI or places data company, we are helping people get somewhere or sell a product,” Michener said. “He said that by knowing everything about a business, and then across 100 million places, ‘That’s a really big opportunity. No one’s done that before.’ It really resonated, and if we share that same vision, we can use capital to grow and to grow efficiently and effectively, why not? Let’s go do it.”

Have a juicy tip or lead about happenings in the venture world? Send tips to Christine Hall at [email protected] or via this Signal link. Anonymity requests will be respected. 

Brandywine Realty Trust says data stolen in ransomware attack

Image Credits: Getty Images / Edwin Remsberg

U.S. realty trust giant Brandywine Realty Trust has confirmed a cyberattack that resulted in the theft of data from its network.

In a filing with regulators on Tuesday, the Philadelphia-based Brandywine described the cybersecurity incident as unauthorized access and the “deployment of encryption” on its internal corporate IT systems, consistent with a ransomware attack.

Brandywine said the cyberattack caused disruption to the company’s business applications that support its operations and corporate functions, including its financial reporting systems.

The company said it shut down some of its systems and believes it has contained the activity. The company confirmed that hackers took files from its systems, but it was still investigating whether any sensitive or personal information was taken.

Brandywine is one of the largest real estate investment trusts (REITs) in the United States, with a portfolio of about 70 properties across Austin, Philadelphia, and Washington, DC, as of its last earnings report in April.

Some of the company’s biggest tenants reportedly include IBM, Spark Therapeutics, and Comcast.

Since the introduction of new rules in December, U.S. publicly traded companies have been obliged to disclose to investors cybersecurity events that may have a material impact on the business. As of the filing, Brandywine said it does not believe the incident is “reasonably likely to materially impact” its operations.

Dell discloses data breach of customers' physical addresses

Michael Dell, chairman and CEO of Dell Technologies, speaking during the “New Strategies for a New Era” keynote at the Mobile World Congress in Barcelona, Spain, on February 27, 2024.

Image Credits: Joan Cros/NurPhoto / Getty Images

Technology giant Dell notified customers on Thursday that it experienced a data breach involving customers’ names and physical addresses.

In an email seen by TechCrunch and shared by several people on social media, the computer maker wrote that it was investigating “an incident involving a Dell portal, which contains a database with limited types of customer information related to purchases from Dell.”

Dell wrote that the information accessed in the breach included customer names, physical addresses and “Dell hardware and order information, including service tag, item description, date of order and related warranty information.” Dell did not say if the incident was caused by malicious outsiders or inadvertent error.

The breached data did not include email addresses, telephone numbers, financial or payment information, or “any highly sensitive customer information,” according to the company. 

The company downplayed the impact of the breach in the message.

“We believe there is not a significant risk to our customers given the type of information involved,” Dell wrote in the email.

When TechCrunch reached out to Dell for comment, asking specific questions such as how many customers were impacted, how the breach occurred and why the company considers that a breach of physical addresses does not pose “a significant risk” to customers, the company responded with a boilerplate version of the email it sent to affected customers. 

A Dell spokesperson, who declined to provide their name, later added: “We are not disclosing this specific information from our ongoing investigation.” Dell did not provide a reason.

On April 29, the website Daily Dark Web reported that someone on a hacking forum was advertising “customer and other information of systems purchased from Dell between 2017 and 2024.”

The person claimed that the dataset included information on 49 million people, such as full names, addresses, system service tags, customer numbers and more. This is data that would align with what Dell disclosed was stolen.

Dell’s spokesperson declined to comment on the forum post, and did not dispute the hacker’s claims.

UPDATE, May 9, 2:12 p.m. ET: This story was updated to add information about the hacking forum post.

Reddit locks down its public data in new content policy, says use now requires a contract

reddit logo broken in half

Image Credits: TechCrunch

On Thursday, Reddit is rolling out a new policy aimed at balancing its desire to license its content to larger tech companies, like Google, and protecting users’ privacy. The newly announced “Public Content Policy” will now join Reddit’s existing privacy policy and content policy to guide how Reddit’s data is being accessed and used by commercial entities and other partners. Related to this, the company also announced a subreddit dedicated to researchers working with Reddit’s data.

The announcement comes shortly after Reddit’s stock market debut, which saw the company position itself to grow revenue not only from the ads that run on its platform and API usage by developers, but also from its corpus of data. In its IPO prospectus, the company said it had already made $203 million through data licensing agreements and expects that number to increase over time.

While Reddit hadn’t historically blocked access to its data for AI training purposes, it changed course last year. Reddit CEO Steve Huffman told The New York Times that it didn’t make sense for Reddit to continue to give “all of that value to some of the largest companies in the world for free,” signaling the company’s plan to move into the data licensing space.

With those efforts now well underway, the new Public Content Policy will lock down access to Reddit’s data without an agreement. (Reddit says it’s not adding new restrictions, just publicizing the policy it’s had in place internally for some time.)

“Unfortunately, we see more and more commercial entities using unauthorized access or misusing authorized access to collect public data in bulk, including Reddit public content,” Reddit writes in its blog. “Worse, these entities perceive they have no limitation on their usage of that data, and they do so with no regard for user rights or privacy, ignoring reasonable legal, safety, and user removal requests. While we will continue our efforts to block known bad actors, we need to do more to restrict access to Reddit public content at scale to trusted actors who have agreed to abide by our policies. But we also need to continue to ensure that users, mods, researchers, and other good-faith, non-commercial actors have access.”

In other words, access to Reddit data for research and other non-commercial efforts will continue, but those entities that want to use Reddit’s data for other purposes — including for AI training — will have to pay. In a graphic shared on the blog, Reddit makes this clear, saying that businesses interested in using Reddit data to “power, augment or enhance your product for any commercial purposes” requires a contract.

Image Credits: Reddit

Advertisers, meanwhile, are directed to an ads API for managing campaigns and tracking their performance.

Because the company is essentially just a large website, indexable by search engines, this new policy aims to lock down Reddit content from any unauthorized collection while also respecting users’ rights.

For instance, Reddit says that its partners will have to uphold users’ decisions to delete their content. So if users don’t want their personal posts to become fodder for future AI engines, they should be able to opt out. Partners are also restricted by the new policy from using Reddit’s content to identify individuals or their personal information, including for ad targeting. Partners also can’t use Reddit content to spam or harass its users or to conduct “background checks, facial recognition, government surveillance, or help law enforcement do any of the above.”

The policy additionally restricts access to adult media and clarifies that Reddit won’t sell its users’ personal information. The company also notes that it will never license non-public content like private messages or non-public account information, like users’ emails or browsing history, among other things.

To help researchers who want to use Reddit data for non-commercial purposes, the company has established a new subreddit, r/reddit4researchers. The company says it’s partnering with OpenMined to also develop a program to guide and grow researchers’ collaboration with Reddit.