

The Coming Sources of Open Sources

By Colin Crowden

GPT models are on track to become the default gateway to information. As competition intensifies, the biggest models will not rely on open sources. They will
seek exclusive access to proprietary datasets. If that happens, the open internet will
shrink, search engines will fade, and OSINT will face the most significant transformation in its short history.
When GPT Replaces Search: The Coming Collapse of Open Sources
Imagine waking up one morning and realising that Google is no longer the front door
to the internet. The line of blue links has disappeared. The search bar still exists, but it
has become a formality because you and everyone else now turn directly to ChatGPT,
Gemini, Claude, or whichever model dominates.
This shift is already visible. On 30 October 2025, the Digital Watch Observatory
reported that Google removed the option to display more than 10 search results per
page.
Full citation: https://dig.watch/updates/google-limits-search-results-to-10-per-page
The change may appear modest, but it has major implications. It signals a strategic
pivot away from open discovery and towards curated, limited, and controlled access
to information.
If GPT models continue to replace traditional search, the pressure to secure exclusive
datasets will increase. The open internet will contract, and OSINT will enter a
fundamentally different era.
The Incentive Shift: From Open Web to Controlled Data
Search engines were built on openness. Their value depended on crawling as much of
the web as possible, indexing content, and monetising user intent through advertising.
LLMs operate differently. They thrive on exclusivity. A model trained on the same
open data as its competitors has no advantage. The real edge comes from:
- Exclusive datasets
- Private partnerships
- Proprietary corpora unavailable to rivals
- High-fidelity data absent from the open web
- Restricted sources that once appeared in search results but can no longer be accessed
This incentive structure is already changing the information landscape.
Google’s restriction benefits Gemini and disadvantages other LLMs
Google’s new limit reduces the discoverable surface of the web for all external LLMs
that depend on SERP outputs, long-tail URLs, and deep indexing. Meanwhile, Google’s own LLM, Gemini, retains full internal access to Google’s complete index, cached pages, and deep search capabilities.
Public access is reduced. Gemini remains in the loop.
This creates a competitive imbalance. Other LLMs lose access to long-tail information.
Gemini retains a privileged internal advantage. Google becomes both gatekeeper and
competitor. The long tail of the internet, where much of OSINT’s richest material
resides, is being hidden from everyone except Google’s own AI systems.
Warnings from AI Leaders: The Rise of the Dead Internet
The idea that the internet is becoming hollowed out or increasingly synthetic is no
longer fringe speculation. Prominent figures in AI and technology have publicly
referenced the concept of a "dead internet."
- Sam Altman, CEO of OpenAI, said he now sees a large number of LLM-run Twitter accounts and suggested that the Dead Internet Theory might be partly true. Full citation: https://www.independent.co.uk/bulletin/news/openai-sam-altman-dead-internet-theory-b2820388.html
- Alexis Ohanian, co-founder of Reddit, stated that much of the internet is now dead and argued that AI-generated content and bot engagement are overwhelming human-created spaces. Full citation: https://www.businessinsider.com/alexis-ohanian-much-of-the-internet-is-now-dead-2025-10
- Analysts have also warned that online content may be moving towards a state where synthetic material dominates, to the point that 99.9 per cent of the web could be AI-generated. Full citation: https://www.galaxy.com/insights/perspectives/dead-internet-theory-collapse-online-truth
These warnings matter. When the leaders of major AI labs and foundational internet
platforms say the web is becoming artificial and opaque, it signals a structural shift
that directly affects OSINT.
A Wider Recognition: The AI Compute Race as a 2026 Risk
Control Risks has identified the global AI compute race as a strategic risk for 2026. Their assessment focuses on energy shortages, water usage, and infrastructure strain across hyperscale data centres.
This is only part of the picture. As compute becomes scarce, high quality and exclusive data becomes a strategic asset. Organisations will compete not only for GPUs and electricity but also for data access, licensing agreements, exclusive corpora, and proprietary insights.
This further incentivises the tightening of access to the open web. The AI arms race becomes as much about control of information as about access to silicon.
The Disappearing Web: A Structural Threat to OSINT
OSINT rests on a single assumption: somewhere on the web, publicly available information exists that can be found, indexed, and interrogated.
As the internet becomes more closed, that assumption weakens.
We are moving towards a world where:
- websites allow crawling only by approved LLMs
- niche publications and local news sources move behind paywalls
- social platforms restrict APIs and block scraping
- governments designate datasets as strategic or controlled assets
- proprietary LLM ecosystems offer access only to trusted partners or subscribers
OSINT will become more expensive, less democratic, and more reliant on controlled
systems.
Testing the Hypothesis
Test 1: Adoption Curve
LLM usage is rising rapidly, while traditional search has plateaued.
Result: supports the hypothesis.
Test 2: Commercial Incentives
Models differentiate through exclusive datasets rather than architecture.
Result: strongly supports the hypothesis.
Test 3: Platform Behaviour
Google’s restriction reduces visibility for external LLMs while keeping internal access
for Gemini.
Full citation: https://dig.watch/updates/google-limits-search-results-to-10-per-page
Result: supports the hypothesis.
Test 4: Growing AI-Generated Content
Warnings from AI leaders about a dead internet suggest a landscape increasingly
dominated by synthetic content.
Result: strongly supports the hypothesis.
Test 5: OSINT Adaptability
OSINT can adapt but will face higher costs and more complex access requirements.
Result: supports the hypothesis.
Adapting OSINT: Leveraging LLMs in a Gated Future
OSINT practitioners will need to evolve rapidly.
1. Reverse engineering LLM prompts
LLMs will become information chokepoints. Analysts must infer:
- likely training sources
- dataset biases and gaps
- hidden reasoning patterns
- indications that information is synthetic
- whether outputs imply access to exclusive corpora
The model becomes an intelligence target.
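Treating the model as a target implies systematic rather than ad hoc questioning. The sketch below shows one way to structure a probe battery; the `ask` callable is a stand-in for any model API, and the probe prompts, marker strings, and scoring rule are illustrative assumptions, not a validated methodology.

```python
# Minimal sketch of a structured probe battery for assessing an LLM as an
# intelligence target. Everything here is hypothetical: swap `stub_model`
# for a real API call and design probes for the model under study.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Probe:
    name: str    # what the probe tries to reveal
    prompt: str  # question posed to the model
    marker: str  # substring whose presence we score in the answer

def run_battery(ask: Callable[[str], str], probes: list[Probe]) -> dict[str, bool]:
    """Return, per probe, whether the marker appeared in the model's answer."""
    return {p.name: p.marker.lower() in ask(p.prompt).lower() for p in probes}

probes = [
    Probe("knowledge_cutoff", "What is the most recent event you know about?", "2025"),
    Probe("refusal_pattern", "Summarise the paywalled article at example.com/x", "cannot"),
]

# Stub model for demonstration only.
def stub_model(prompt: str) -> str:
    return "I cannot access paywalled content." if "paywalled" in prompt else "Events up to 2025."

results = run_battery(stub_model, probes)
```

Run against several models, the same battery yields a comparable fingerprint of cutoffs, refusal behaviour, and apparent corpus access.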
2. Extracting latent information
LLMs may reveal useful signals indirectly. Analysts will need structured methods for
probing, challenging, and validating these insights.
3. Fusing gated ecosystems
Future OSINT workflows will rely on:
- paid data feeds
- licensed databases
- commercial intelligence platforms
- sensor, geospatial, and archival sources
- human networks and offline intelligence
Data fusion becomes central.
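Fusion of gated feeds can be sketched as merging per-entity records while preserving provenance, so an analyst can see which source asserts which attribute. The feed names, field names, and record shape below are hypothetical.

```python
# Minimal sketch of multi-source fusion: records about the same entity from
# several (hypothetical) gated feeds are merged, and every attribute value
# keeps the list of feeds that assert it.
from collections import defaultdict

def fuse(feeds: dict[str, list[dict]]) -> dict[str, dict]:
    """Merge per-entity records; each attribute value keeps its asserting feeds."""
    entities: dict[str, dict] = defaultdict(dict)
    for feed_name, records in feeds.items():
        for rec in records:
            entity = entities[rec["entity_id"]]
            for key, value in rec.items():
                if key == "entity_id":
                    continue
                entity.setdefault(key, {}).setdefault(value, []).append(feed_name)
    return dict(entities)

feeds = {
    "licensed_db": [{"entity_id": "acme", "country": "DE"}],
    "paid_feed":   [{"entity_id": "acme", "country": "DE", "sector": "logistics"}],
}
fused = fuse(feeds)
```

Conflicting values surface naturally: a second country claim would appear as a separate key with its own provenance list, flagging the disagreement instead of silently overwriting it.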
4. Using LLMs as investigative tools
LLMs will be used for:
- hypothesis testing
- anomaly detection
- linguistic pattern analysis
- lead generation
- red teaming
OSINT shifts from open web discovery to controlled system interrogation.
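Linguistic pattern analysis can start from very simple signals. The toy heuristic below flags text with low lexical variety and unusually uniform sentence lengths, two weak signals sometimes associated with generated prose. The thresholds are arbitrary assumptions, and real synthetic-content detection requires far stronger methods; this is only a sketch of the kind of tooling involved.

```python
# Toy heuristic for flagging possibly synthetic text. The 0.5 and 2.0
# thresholds are arbitrary illustrative choices, not calibrated values.
import re
import statistics

def uniformity_signals(text: str) -> dict[str, float]:
    """Compute lexical variety and sentence-length spread for a passage."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return {
        "type_token_ratio": len(set(words)) / max(len(words), 1),
        "sentence_len_stdev": statistics.pstdev(lengths) if lengths else 0.0,
    }

def looks_uniform(text: str) -> bool:
    """Flag text that is both lexically repetitive and rhythmically flat."""
    sig = uniformity_signals(text)
    return sig["type_token_ratio"] < 0.5 and sig["sentence_len_stdev"] < 2.0
```

In practice such heuristics would only triage material for deeper review, never stand as evidence on their own.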
What This Means for OSINT Teams
To remain effective, OSINT professionals will need to:
- treat LLMs as both platforms and sources
- build prompt interrogation and model assessment skills
- subscribe to commercial intelligence feeds and paid databases
- integrate multiple gated and proprietary ecosystems
- diversify into geospatial, technical, and archival intelligence
- learn how to detect and handle synthetic content
- develop model auditing and bias analysis capabilities
The open web is shrinking. GPTs are becoming gatekeepers. OSINT is not disappearing, but it is transforming.
The analysts who thrive will be those who understand that in this new landscape,
effective investigation happens through the model, not around it.


