Fast, comprehensive PDF malware scanning with instant results
PDFy is a web service that analyzes PDF files for malicious content. It offers three interfaces (REST API, web UI, and TUI) for scanning PDFs and detecting malware, suspicious keywords, embedded scripts, and other security threats.
| Document | Description |
|---|---|
| Product Vision | Product scope, goals, and release intent |
| System Architecture | High-level system boundaries and components |
| Analysis Pipeline | Scan lifecycle and analysis stages |
| API Contracts | REST API surface and payload definitions |
| Scan Result Schema | Result data structures |
| Design Spec | Product design and implementation details |
| Production Plan | MVP implementation roadmap |
| Privacy & Security | Data retention, deletion, and privacy policies |
| Deployment Guide | Environment setup and deployment |
| ADR Decisions | Technical architecture decisions |
# Install the analyzer engine
pip install -e ./services/analyzer
# Install the API package
pip install -e ./apps/api
# Start the API server on port 8000
cd apps/api
uvicorn app.main:app --host 0.0.0.0 --port 8000
Open apps/web/index.html in your browser for a user-friendly PDF analysis interface.
# Analyze a PDF file
curl -X POST http://localhost:8000/analyze/fast \
-F "file=@sample.pdf"
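The same upload can be sketched in Python using only the standard library. This is a minimal illustration, not part of PDFy itself: the helper names `build_multipart` and `analyze_fast` are hypothetical, and it assumes the endpoint accepts a multipart form field named `file` (as in the curl call above) and replies with JSON.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Encode a single file as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def analyze_fast(pdf_path: str, base_url: str = "http://localhost:8000") -> dict:
    """POST a PDF to /analyze/fast and return the decoded JSON reply.

    Assumes the API is running locally, as started in the steps above.
    """
    path = Path(pdf_path)
    body, content_type = build_multipart("file", path.name, path.read_bytes())
    request = urllib.request.Request(
        f"{base_url}/analyze/fast",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    print(json.dumps(analyze_fast("sample.pdf"), indent=2))
```

Building the multipart body by hand avoids a third-party dependency; a client library such as `requests` would shorten this considerably if it is already available in your environment.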
# If TUI is implemented
python -m apps.tui.main
| Endpoint | Method | Description |
|---|---|---|
| /analyze/fast | POST | Quick analysis with keyword/IOC detection |
| /analyze/deep | POST | Deep scan with PDF metadata and JavaScript analysis |
| /health | GET | API health check |
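As a rough sketch of how a client might use these endpoints, the snippet below builds the URL for a scan mode and probes `/health` before submitting work. The function names `endpoint_for` and `is_healthy` are hypothetical, and treating any HTTP 200 reply as "healthy" is an assumption; the actual health payload is not described here.

```python
import urllib.request

# Scan modes mapped to the paths listed in the endpoint table.
SCAN_PATHS = {"fast": "/analyze/fast", "deep": "/analyze/deep"}


def endpoint_for(mode: str, base_url: str = "http://localhost:8000") -> str:
    """Return the full URL for a scan mode ("fast" or "deep")."""
    if mode not in SCAN_PATHS:
        raise ValueError(f"unknown scan mode: {mode!r}")
    return base_url.rstrip("/") + SCAN_PATHS[mode]


def is_healthy(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if GET /health answers with HTTP 200 (assumed success signal)."""
    try:
        with urllib.request.urlopen(base_url.rstrip("/") + "/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    if is_healthy():
        print("API is up; deep scans go to", endpoint_for("deep"))
    else:
        print("API is not reachable")
```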
{
  "file_name": "document.pdf",
  "sha256": "abc123...",
  "keyword_hits": ["Suspicious keyword"],
  "iocs": {
    "urls": ["http://example.com"],
    "ips": ["192.168.1.1"],
    "emails": ["test@example.com"]
  },
  "summary": {
    "verdict": "suspicious",
    "score": 65,
    "confidence": "high"
  }
}
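A response shaped like the example above could be condensed for triage along these lines. This is only a sketch: `ScanSummary`, `summarize`, `needs_review`, and the review threshold of 50 are illustrative assumptions rather than part of PDFy, and the field names are taken from the sample response alone.

```python
from dataclasses import dataclass


@dataclass
class ScanSummary:
    verdict: str
    score: int
    confidence: str
    ioc_count: int
    keyword_count: int


def summarize(result: dict) -> ScanSummary:
    """Condense a scan result dict into a few triage fields.

    Missing keys fall back to neutral defaults instead of raising.
    """
    summary = result.get("summary", {})
    iocs = result.get("iocs", {})
    return ScanSummary(
        verdict=summary.get("verdict", "unknown"),
        score=int(summary.get("score", 0)),
        confidence=summary.get("confidence", "unknown"),
        # Total indicators across every IOC category (urls, ips, emails, ...).
        ioc_count=sum(len(values) for values in iocs.values()),
        keyword_count=len(result.get("keyword_hits", [])),
    )


def needs_review(summary: ScanSummary, threshold: int = 50) -> bool:
    """Flag results whose score meets a (hypothetical) review threshold."""
    return summary.score >= threshold


if __name__ == "__main__":
    sample = {
        "file_name": "document.pdf",
        "keyword_hits": ["Suspicious keyword"],
        "iocs": {
            "urls": ["http://example.com"],
            "ips": ["192.168.1.1"],
            "emails": ["test@example.com"],
        },
        "summary": {"verdict": "suspicious", "score": 65, "confidence": "high"},
    }
    report = summarize(sample)
    print(report, needs_review(report))
```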
# Run tests
pytest
# Run with coverage
pytest --cov
PDFy/
├── apps/
│ ├── api/ # FastAPI REST API
│ ├── web/ # Web interface (HTML/JS)
│ └── tui/ # Terminal interface
├── services/
│ └── analyzer/ # PDF analysis engine
├── docs/ # Documentation
└── README.md # This file
See the Privacy & Security documentation for data retention, deletion, and privacy policies.
See project repositories for licensing information.