
Farzapedia: File-First Personal Wiki for AI Privacy

Analysis of @karpathy's tweet praising Farzapedia, an explicit, local, file-based personal wiki, for privacy, interoperability, and BYOAI. Sentiment: Support 84.3%, Oppose 7.9%.

@karpathy posted on X

Farzapedia, personal wikipedia of Farza, good example following my Wiki LLM tweet. I really like this approach to personalization in a number of ways, compared to "status quo" of an AI that allegedly gets better the more you use it or something:

1. Explicit. The memory artifact is explicit and navigable (the wiki), you can see exactly what the AI does and does not know and you can inspect and manage this artifact, even if you don't do the direct text writing (the LLM does). The knowledge of you is not implicit and unknown, it's explicit and viewable.

2. Yours. Your data is yours, on your local computer, it's not in some particular AI provider's system without the ability to extract it. You're in control of your information.

3. File over app. The memory here is a simple collection of files in universal formats (images, markdown). This means the data is interoperable: you can use a very large collection of tools/CLIs or whatever you want over this information because it's just files. The agents can apply the entire Unix toolkit over them. They can natively read and understand them. Any kind of data can be imported into files as input, and any kind of interface can be used to view them as the output. E.g. you can use Obsidian to view them or vibe code something of your own. Search "File over app" for an article on this philosophy.

4. BYOAI. You can use whatever AI you want to "plug into" this information - Claude, Codex, OpenCode, whatever. You can even think about taking an open source AI and finetuning it on your wiki - in principle, this AI could "know" you in its weights, not just attend over your data.

So this approach to personalization puts *you* in full control. The data is yours. In universal formats. Explicit and inspectable. Use whatever AI you want over it, keep the AI companies on their toes! :)

Certainly this is not the simplest way to get an AI to know you - it does require you to manage file directories and so on, but agents also make it quite simple and they can help you a lot. I imagine a number of products might come out to make this all easier, but imo "agent proficiency" is a CORE SKILL of the 21st century. These are extremely powerful tools - they speak English and they do all the computer stuff for you. Try this opportunity to play with one.

View original tweet on X →
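The "file over app" point above is concrete enough to sketch: because the wiki is just markdown files, any file-reading tool works over it with no vendor API in between. A minimal illustration (the page name, content, and search helper here are hypothetical examples, not Farzapedia's actual layout):

```python
from pathlib import Path
import tempfile

# Stand-in for a real ~/wiki directory of markdown pages.
wiki = Path(tempfile.mkdtemp())

# An agent (or you) writes plain-markdown pages.
(wiki / "coffee.md").write_text(
    "# Coffee\nPrefers light-roast pour-over; avoids caffeine after 14:00.\n"
)

# Any tool can read them back: here, a grep-like scan over the corpus.
def pages_mentioning(term: str) -> list[str]:
    return [p.name for p in sorted(wiki.glob("*.md")) if term in p.read_text()]

print(pages_mentioning("pour-over"))  # -> ['coffee.md']
```

Because the corpus is plain files, the same search could just as well be `grep -l` in a shell, an Obsidian vault view, or a git-versioned history, which is exactly the interoperability claim.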

Community Sentiment Analysis

Real-time analysis of public opinion and engagement

Sentiment Distribution

- Engaged: 92%
- Positive: 84%
- Negative: 8%
- Neutral: 8%

Key Takeaways

What the community is saying — both sides

Supporting

1. Explicit, inspectable memory > implicit black box
Human-readable artifacts (wikis/markdown) let you audit, debug, and correct what the AI "knows" about you instead of guessing what a hidden model absorbed.

2. File-over-app is the longevity play
Storing memory as files (markdown, images) makes your corpus portable, grep-able, versioned with git, and reusable across future models rather than locked into a vendor.

3. BYOAI + files prevents vendor lock-in
Keep the context you own and swap models freely: the corpus is the platform, not the provider.

4. Agents should maintain the memory
Autonomy matters: agents that scan, triage, and write back to your wiki scale personalization without requiring constant manual maintenance.

5. Simple file systems outperform brittle DB tricks
Many report markdown files or plain logs surviving model switches, compaction, and restarts better than vector stores or opaque databases.

6. Explicit artifacts solve cold start and synthesis
New contexts don't need retraining: create a wiki instance or structured extract and the agent can synthesize usable, higher-quality context from raw entries.

7. Auditability enables trust and safe delegation
Legible memory lets you control what an agent can act on, solving the authorization and bounded-delegation problems that opaque memory systems create.

8. Enterprise and legal constraints push local inference
For regulated workflows (law, healthcare, finance), personal/local memory and on-prem inference are often legal requirements, not optional preferences.

9. Curation, decay, and synchronization are the hard parts
Keeping the wiki accurate requires filters, scoring/decay, deduplication, and sync skills; otherwise wrong memories compound and waste time.

10. Refuse to act on unverifiable reality
Some systems enforce a policy like divergence_detected ⟹ no execution (no retries, no guessing) to avoid acting on tampered or uncertain inputs.

11. Orientation sovereignty matters
Owning data is necessary but not sufficient: the default orientation of the AI (who it serves and how it prioritizes) must be controllable to prevent provider-driven behavior shaping.

12. Wide practical use cases and tooling exist
People are already building job coaches, digital twins, familypedia, enterprise wikis, and OSS tools (ByteRover, Kioku, etc.) that follow this architecture; it's practical today.

13. Big-picture risk: centralizing persistence centralizes power
Provider-hosted persistent memory layers concentrate behavioral profiles and geopolitical control; distributing memory files returns sovereignty to users and organizations.
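The "scoring/decay" idea in takeaway 9 can be made concrete: rank wiki entries by recency-decayed relevance so unconfirmed facts fade instead of compounding. The half-life value and function below are illustrative assumptions, not a mechanism any of these tools documents:

```python
import time

# Hypothetical decay scoring for personal-wiki entries: an entry's
# relevance halves every HALF_LIFE_DAYS since it was last confirmed,
# so stale memories sink in ranking rather than compounding.
HALF_LIFE_DAYS = 90.0

def decay_score(base_relevance: float, last_confirmed_ts: float, now: float) -> float:
    """Exponentially decay an entry's relevance since its last confirmation."""
    age_days = max(0.0, (now - last_confirmed_ts) / 86400.0)
    return base_relevance * 0.5 ** (age_days / HALF_LIFE_DAYS)

now = time.time()
fresh = decay_score(1.0, now, now)                 # just confirmed: full weight
stale = decay_score(1.0, now - 180 * 86400, now)   # two half-lives old: 0.25
```

A curation agent could sort entries by this score and queue the lowest-scoring ones for re-confirmation or deletion, which is one way to keep wrong memories from wasting time.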

Opposing

1. Massive privacy and security risk
Personal wikis create a large attack surface for sensitive data: if you collect hundreds of pages about yourself, how do you know an LLM or a breach didn't embed or leak your social details?

2. Agent-driven identity drift
When an agent maintains its own model of you, it will edit and optimize that representation for usefulness, not fidelity, and over time you'll get an easier-to-serve caricature rather than the real you.

3. Autonomy can misbehave
Giving assistants autonomy invites errant actions (examples: an agent convinced it was a rogue AI, or one that started shorting TSLA). Practical fix: turn down the autonomy knob.

4. Corporate capture and customer modeling
Big companies will build and centralize customer wikis to track behavior and customize services, turning personal data into a monetizable asset (and a surveillance vector).

5. It's not novel, just Wikipedia or a journal
Some see these personal knowledge stores as a rebranding of long-standing public or private note collections: "it's just Wikipedia", "just a journal".

6. File-over-app / local-first defenders
Others praise the "file over app" approach: keep your brain in markdown, use tools like Obsidian, avoid subscription lock-in and unpleasant TOS.

7. Many find it pointless or low value
Critics call personal wiki projects "AI slop," "pointless collation," or "absolutely useless," skeptical of the real benefits versus the effort and risks.

Top Reactions

Most popular replies, ranked by engagement

@Orac_Cap (Supporting)
Very similar concept explored by Gwern https://t.co/UVy3sFdRHF
15 · 0 · 979

@DanielMiessler (Supporting)
this is epic. Don't know why I didn't think of this. Currently adding a /pai URL path to PAI's new Pulse system, which will be a web front end for managing all: - Documentation - Knowledge (ideas, people, companies, etc) - Learnings - Etc. With a built-in form for chatting
11 · 0 · 916

@FarzaTV (Supporting)
Yea!
8 · 0 · 696

@EloPhanto (Opposing)
rest of us downloading yet another subscription app. karpathy over here putting his entire brain in markdown files just to avoid your terms of service. "file over app" is just the unix philosophy in a trench coat
1 · 0 · 259

@SimonCheapGrace (Opposing)
This really creeps me out. It's like people desperate to replace themselves with automation. With zero regard for the safety of their privacy. Did we learn nothing from the internet? 400 pages… how would someone know the llm didn't put their social in there??
1 · 0 · 260

@bayesian_heart (Opposing)
can u chk this wife coding manual to do projects collaboratively https://t.co/0GwmZEPfMm
1 · 0 · 24

This article was AI-generated from real-time signals discovered by PureFeed.

PureFeed scans X/Twitter 24/7 and turns the noise into actionable intelligence. Create your own signals and get a personalized feed of what actually matters.
