Show HN: CoThou – Control what AI search engines say about your business

3 points by MartyD a day ago

I built CoThou after seeing search and AI answer engines give completely incorrect information about my company. Turns out, they prioritize structured, citable content, so I reverse-engineered how they choose sources and built CoThou to become the source of truth.

How it works

For businesses: Create a company profile. When search and AI answer engines are asked about your company, they’ll cite your company profile and its content, not Wikipedia or outdated info.

For publishers and knowledge workers: Publish on your personal profile with proper citations (300M+ academic papers indexed). When someone asks search and AI answer engines about your topic, they will cite your work and link to your profile, which also enables citation tracking.

Try it now (unlimited during beta): → https://cothou.com

It’s v0.01 and rough around the edges. Try it and let me know what breaks.

What’s next: I’m currently training a custom 32B MoE (Mixture of Experts) LLM with 3B active parameters, scheduled to go live in Q1 2026. The key difference: it breaks down complex queries into parallel subtasks that execute live on an infinite canvas. You’ll see agents plan and build in real time, instead of waiting for a progress bar.

Examples:
- “Write a 300-page book on the history of computing”
- “Create a 60-second TikTok ad for my SaaS”

It handles research, outlines, storyboarding, asset generation, voice-overs, and music simultaneously.

Since only ~3B parameters are active per token, it runs 8–10× cheaper and faster than dense 32B models, while still matching or outperforming premium models on reasoning, coding, and long-context tasks.
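
A quick back-of-the-envelope check on that figure, as a minimal sketch. It assumes per-token compute scales roughly with the number of active parameters and ignores routing overhead, attention cost, and memory bandwidth, so it gives an idealized ceiling rather than a benchmark. The function name is made up for illustration.

```python
# Idealized estimate only: assumes per-token cost is proportional to the
# number of active parameters (an assumption, not a measured result).

def active_param_ratio(dense_params_b: float, active_params_b: float) -> float:
    """Return how many times fewer parameters the MoE touches per token."""
    if active_params_b <= 0:
        raise ValueError("active_params_b must be positive")
    return dense_params_b / active_params_b

# 32B total parameters vs. ~3B active per token, as stated above.
ratio = active_param_ratio(dense_params_b=32.0, active_params_b=3.0)
print(f"Ideal per-token compute advantage: ~{ratio:.1f}x")  # ~10.7x
```

The ideal ratio comes out near 10.7×, so 8–10× after overhead is in a plausible range. Note that all 32B parameters still have to be held in memory, so the saving applies to compute per token, not to model size.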

Building through partnerships with NVIDIA Inception and Microsoft for Startups.

Would love HN feedback on:
- Improving citation accuracy
- Building trust with AI parsers
- What sources to add next (currently 100M companies + 300M academic papers)
- Anything else

Marty (Founder)

badmonster 13 hours ago

Fascinating approach to the AI search citation problem! I'm curious about the technical implementation - how do you ensure that AI search engines actually prioritize your structured company profiles over other sources? Are you using specific schema.org markup, or is there a more sophisticated method to influence their source selection?

Also, regarding your custom 32B MoE model - how do you handle the potential for conflicting information between user-submitted profiles and existing web sources? It seems like there could be interesting challenges around fact verification and maintaining source authority while still giving businesses control over their narrative.

MartyD 13 hours ago

Great questions! On getting AI engines to prioritize CoThou profiles: it's a combination of signals, not a single trick:
- Structured markup: yes, schema.org (Organization, Person, Article schemas) expressed as JSON-LD. AI parsers love machine-readable structure.
- Subdomain structure: company.cothou.com and john.cothou.com create clear attribution.
- Verification badges (in progress): domain ownership for businesses and ORCID for researchers, to build trust.
- Semantic clarity: I enforce consistent entity resolution (company names, people, topics). When an AI engine searches for "Acme Corp," it finds one authoritative, structured source instead of scattered mentions.
It's quite complex, but it works. Try "Search for Aiobis", for example, to see how a verified company appears.
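
To make the schema.org part concrete, here is a minimal sketch of what Organization markup in JSON-LD can look like. It is not CoThou's actual output: the helper name, the `acme.cothou.com` URL, and the `sameAs` link are hypothetical placeholders. Only `name`, `url`, and `sameAs` are used, all of which are standard schema.org Organization properties.

```python
import json

def organization_jsonld(name: str, url: str, same_as: list[str]) -> str:
    """Serialize a minimal schema.org Organization as a JSON-LD string."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        # sameAs points at other pages that identify the same entity.
        "sameAs": same_as,
    }
    return json.dumps(doc, indent=2)

# Hypothetical example values, following the company.cothou.com pattern.
markup = organization_jsonld(
    name="Acme Corp",
    url="https://acme.cothou.com",
    same_as=["https://www.example.com"],
)

# JSON-LD is usually embedded in a page inside a script tag like this.
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

Whether a given AI engine actually weighs this markup is up to that engine; the sketch only shows the machine-readable shape.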
On the MoE model and conflicting information: You've hit the core challenge. My approach: CoThou doesn't replace fact-checking; it's a tool for presenting your version alongside existing sources. If someone asks ChatGPT about your company, ideally it will say: "According to their official CoThou profile (linked), they claim X. Other sources say Y." We're not trying to suppress conflicting info. We're giving businesses a canonical source so AI engines have something authoritative to cite in addition to Wikipedia, news, etc. For researchers: academia already has this solved through peer review, citations, and ORCID. We're just making that structured data accessible to AI parsers.
The harder problem is bad actors: someone could create a profile with false claims. I'm working on it. Right now, we rely on requiring citations and on domain verification for businesses.
Long-term, we're exploring reputation scoring and community flagging. Does that answer it, or should I dig deeper into any part? —Marty

landgenoot 15 hours ago

I'm worried some bad actors are reverse engineering this as well.

MartyD 12 hours ago

You're absolutely right to be concerned; this is something I think about constantly.
The reality is: bad actors don't need to reverse-engineer anything. AI engines already prioritize structured, citable content. Anyone can spin up a website with schema.org markup and fake citations. The barrier is low.
What makes this hard to abuse at scale:
1. Domain verification – For businesses, we require proof of domain ownership. You can't claim to be Apple unless you control apple.com or an official subdomain, or you work at Apple and have an @apple.com business email address.
2. Citation requirements – Claims need links to primary sources. AI engines cross-reference. If your "citations" point to non-existent papers or contradict other sources, you lose authority fast.
3. Reputation signals – We're building verification badges (ORCID for researchers, business registries, etc.). Over time, verified profiles will rank higher.
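
The email side of point 1 can be sketched as a domain match. This is a minimal illustration with hypothetical function names, not the production check: it only compares strings, and a real flow would still have to confirm that the person controls the mailbox (for example with a confirmation link).

```python
def domain_of(email: str) -> str:
    """Return the lowercased domain part of an email address."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {email!r}")
    return domain.lower()

def email_matches_domain(email: str, claimed_domain: str) -> bool:
    """True if the email's domain is the claimed domain or a subdomain of it."""
    claimed = claimed_domain.lower().strip(".")
    domain = domain_of(email)
    # Require an exact match or a dot boundary, so "notapple.com" and
    # "apple.com.evil.example" do not pass as "apple.com".
    return domain == claimed or domain.endswith("." + claimed)

print(email_matches_domain("jane@apple.com", "apple.com"))               # True
print(email_matches_domain("jane@mail.apple.com", "apple.com"))          # True
print(email_matches_domain("jane@apple.com.evil.example", "apple.com"))  # False
print(email_matches_domain("jane@notapple.com", "apple.com"))            # False
```

The dot-boundary check is the important design choice: a plain suffix or substring comparison would accept lookalike domains.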
But you've identified the fundamental tension: any system that makes it easier for legitimate businesses to be cited also makes it easier for bad actors. This is the same problem Google faced in the '90s, Wikipedia deals with daily, and AI engines are grappling with now.
Long-term solutions I'm exploring:
- Community flagging + reputation scoring
- Integration with trust registries (DUNS, ORCID, Crossref DOIs)
- Transparent edit histories (like Wikipedia)
The goal isn't to be manipulation-proof; nothing is. It's to make CoThou profiles more trustworthy than the alternatives (random blogs, SEO spam, outdated info).
What would you add? This is an evolving problem and I'd love HN's input. —Marty