# Ticket: Safe web browsing with prompt-injection guard

## Metadata
- Type: Ticket
- Status: Planned
- Project: Pi / Security
- Created: 2026-05-17
- Updated: 2026-05-17
- Priority: High

## Goal

Add web browsing capability to Pi while protecting against prompt injection and malicious web content, using a non-LLM detection/gating layer before Pi reads or reasons over fetched content.

## Why

Web browsing would make Pi much more useful for research, documentation, troubleshooting, and current information. However, arbitrary web pages can contain instructions aimed at hijacking the assistant. The user explicitly wants strong protection and a non-LLM system that automatically checks content before all read operations.

## Scope

Included:
- Design a browsing architecture with clear separation between fetch, sanitize, scan, summarize, and LLM-read steps.
- Implement or prototype a non-LLM prompt-injection detector/gate before Pi reads fetched content.
- Use deterministic rules, pattern matching, allow/deny lists, structural parsing, metadata stripping, and risk scoring where practical.
- Define safe handling for HTML, Markdown, PDFs, code snippets, comments, hidden text, and metadata.
- Require citations/source URLs for any browsed claims.
- Log fetches and guard decisions for auditability.
- Fail closed for high-risk content.

Not included:
- Fully trusting the detector as perfect security.
- Letting web pages issue tool commands or change Pi's instructions.
- Autonomous browsing of arbitrary sites without user-visible scope.

## Acceptance Criteria

This ticket is done when:
- [ ] A threat model for browsing/prompt injection exists.
- [ ] A browser/fetch architecture is documented.
- [ ] A non-LLM guard design exists.
- [ ] The guard runs before content is presented to the LLM.
- [ ] High-risk content is blocked or quarantined.
- [ ] Fetch/read operations are logged.
- [ ] A test corpus of benign and malicious prompt-injection samples exists.
- [ ] The system is documented with clear user-facing limitations.

## Questions

- Should browsing be implemented as a Pi extension, a standalone script, or a restricted service?
- Which content types should be supported first: HTML only, or HTML plus PDFs?
- Should the first version allow only user-approved domains?
- What should happen when a page fails the guard: block entirely, extract only links/title, or ask the user?

## Plan / Next Actions

- [ ] Write a browsing threat model.
- [ ] Research current prompt-injection attacks and defenses when browsing is available.
- [ ] Design deterministic detector rules and risk scoring.
- [ ] Build a small local test corpus.
- [ ] Prototype safe fetch + sanitize + guard pipeline.
- [ ] Integrate with Pi only after the guard is testable.

## 2026-05-17 Spec Advancement

Confidence level: high for safety requirements; medium for deterministic guard MVP; low for effectiveness against all attacks.

Decisions now stable:
- Browsing must be designed as untrusted-content handling, not normal file reading.
- Fetch/sanitize/scan must happen before LLM exposure.
- The first version should fail closed and support allowlisted/user-approved URLs.
- Web content must never be allowed to change Pi's system/developer instructions or trigger tool calls directly.

Proposed MVP spec:
- Build a standalone `safe_fetch` pipeline before integrating with Pi tools.
- Steps: fetch URL → strip scripts/styles/hidden text/metadata → extract visible text → deterministic injection scan → risk report → only then provide sanitized excerpts.
- Guard signals: instruction phrases aimed at assistants, tool-use commands, credential requests, hidden text, base64/obfuscation, prompt boundary markers, excessive imperative/system language, suspicious comments/metadata.
- Outputs: sanitized text, source URL, content hash, risk score, triggered rules, blocked/quarantined flag.
- Maintain a local malicious/benign test corpus.

Updated next actions:
- [ ] Write `projects/pi-security/safe-browsing-threat-model.md`.
- [ ] Create a small prompt-injection test corpus.
- [ ] Prototype deterministic scanner as a script with unit tests.
- [ ] Only then consider Pi tool integration.

## 2026-06-06 Refinement: SearxNG, prescan, and isolated web researcher

User-approved direction:
- Add a private SearxNG service to the network as the controlled search provider.
- Build a pre-LLM document prescan step that detects known prompt-injection phrases and techniques before the LLM sees fetched content.
- Include a deterministic/algorithmic threat score, including a word-rank or weighted-term scoring approach.
- Use an isolated web researcher role/agent with sharply reduced permissions as the preferred architecture for browsing/research.

Prescan MVP ideas:
- Maintain a local ruleset of known injection phrases, role-confusion attempts, tool-use instructions, exfiltration requests, system/developer prompt references, credential requests, hidden-text markers, obfuscation, and prompt-boundary tokens.
- Compute a weighted score from matched phrases, suspicious imperative language, density of assistant-targeted terms, hidden/metadata content, and encoded text.
- Return a risk report containing score, triggered rules, matching excerpts, content hash, source URL, and recommended handling: allow, allow excerpts only, quarantine, or block.
- Run this scanner before any fetched web-page body is included in LLM context.
- Fail closed for high-risk content.

Related ticket:
- `tickets/active/2026-06-06-add-searxng-service.md`

## Notes

- User requested this as Pi upgrade item 3 on 2026-05-17.
- User specifically requested a non-LLM detector before all Pi read operations involving web content.
- User has serious prompt-injection concerns and prefers defense-in-depth over trusting prompt-only mitigations.
