PDF Penguin
AI-powered PDF to JSON conversion for structured, usable data
A comprehensive document processing tool that transforms unstructured PDF data into clean, structured JSON for developers and data analysts.

Overview
TL;DR
- Problem: Non‑technical users struggle to extract structured data from PDFs quickly.
- Solution: Drag‑drop PDF → prompt desired structure → instant JSON output.
- Success criteria: time‑to‑first‑output ≤ 10s, ≥ 90% parse success for common docs, clear fallback for scans.
Constraints
- Solo project; limited time box for v1.
- Mixed‑quality inputs (scans vs. digital PDFs).
- Model/OCR variability; must guide user prompts.
Collaboration & Feedback
- Peer dev feedback on output schema clarity → added copy and examples.
- Early testers (friends/Discord) struggled with vague prompts → added placeholder guidance.
- Iterated UX on empty/error states after scan failures.
While developing my cooking assistant app, Chefie, I needed a way to extract structured ingredient and nutrition data from USDA PDFs. The datasets were available, but they were formatted as complex, unstructured PDFs that were difficult to work with programmatically. Existing tools were unreliable or too technical, so I built a clean, AI-powered tool for parsing and exporting PDF data into usable JSON.
PDF Penguin has since evolved into a standalone product with broader application across document-heavy industries.
My Role
This was a completely solo project; I handled every aspect, from product vision to deployment.
- Founder
- Product Designer (UX & UI)
- Frontend Developer (React, TailwindCSS)
- AI Integration (Vision Models)
Problem & Goal
The biggest issue with existing PDF parsers was that they were overly technical — requiring specialized setup, manual formatting, command-line usage, or developer-only integrations. Tools like Tabula and Adobe's OCR exports were powerful but inaccessible to non-technical users. Many required users to predefine table structures or fiddle with JSON schemas before seeing results, which added friction for those just trying to extract usable information from documents.
Additionally, tools often failed with scanned documents or image-based PDFs, offering inconsistent or incomplete results. Even when they worked, the interfaces were cluttered and required unnecessary steps or downloads.
I set out to design a tool that required zero onboarding: drag, drop, type what you want — and get clean JSON instantly. No setup. No training. Just output.
Design Process
Step 1: Empathize & Research
I explored a range of existing PDF parsing tools including Tabula, Adobe Acrobat's OCR export, and DocParser. While all three were technically capable, they presented significant barriers for non-technical users — requiring either installation, rule-building, or an understanding of export settings and schemas. I tested each one by attempting to extract structured data without relying on documentation or setup guides, simulating the experience of a first-time user with minimal technical background.
Across the board, I encountered slow onboarding, confusing interfaces, and results that required multiple adjustments or retries. Even simple use cases like "extract table data" demanded upfront learning or configuration. These pain points reinforced a clear gap in the space: a need for a tool that provides structure, flexibility, and results — without setup or specialized knowledge.
The table below summarizes how PDF Penguin compares to these tools, based on that usability-first evaluation.

| Tool | Setup required | Main barrier for non-technical users |
|---|---|---|
| Tabula | Desktop installation | Manual table selection and export tweaks |
| Adobe Acrobat (OCR export) | Acrobat license | Export settings and schema knowledge |
| DocParser | Account plus parsing rules | Rule-building before any output appears |
| PDF Penguin | None | None; plain-language prompt, instant JSON |
Step 2: Define
The research made one thing clear: the biggest barrier wasn't technical capability — it was usability. Most tools assumed the user had experience with templates, schemas, or parsing rules. I defined the core product need as creating a parsing tool that eliminated setup entirely. PDF Penguin would focus on a single principle: let users describe what they want in plain language, and deliver results instantly — simple to use, with no learning curve.
Step 3: Ideate

I had a clear mental model of how the product should behave, so I quickly sketched a basic 2-panel layout idea: Upload (left) → Output (right), supported by a flexible prompt box to direct the AI. The goal was instant clarity, minimal onboarding, and the ability to adjust the output on the fly.
Step 4: Prototype & Design
With a clear layout already in mind, I skipped static design tools and moved directly into high-fidelity interface development, iterating on live deployments through Vercel. Adjusting spacing, labels, and user flows in context let me make real-time decisions based on how the interface behaved, not just how it looked. Cursor supported this process by helping structure responsive components and quickly refine interactive behaviors.
Step 5: Test & Iterate
Each time I implemented a new UI or prompt behavior, I tested it by uploading different document types — invoices, receipts, reports — and refining the prompt UX to guide the AI parser. After discovering poor output from vague prompts, I added a customizable instruction field and clarified the placeholder text to guide user input.
Once the interface flow and AI behavior were reliable, I built the full app in Cursor with a React + TailwindCSS frontend, integrated OCR and OpenAI APIs, and deployed it via Vercel. The result is a working product with real users, capable of turning even messy PDFs into structured data in seconds.

Key User Flow
The PDF Penguin homepage was intentionally designed to reflect the product's core promise: instant transformation of unstructured PDFs into usable data.

- Upload a PDF — Users drag and drop a file or click to upload directly into the left-side panel.
- Customize the Output — A prompt input field allows users to describe exactly what kind of data they want to extract.
- Receive Structured JSON — The right-side panel displays the generated JSON output in real time.
This flow was designed to minimize user friction, reduce the need for technical knowledge, and deliver immediate visual feedback. The layout and design decisions prioritize first-time usability while giving power users flexibility to define their output format.
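As a rough illustration of how this two-panel flow maps to components, here is a minimal sketch assuming the React + TailwindCSS stack described below; the component structure, the /api/parse endpoint, and the handler names are illustrative assumptions, not PDF Penguin's actual source.

```tsx
// Illustrative sketch only: names and structure are assumptions,
// not PDF Penguin's production code.
import { useState } from "react";

export default function App() {
  const [file, setFile] = useState<File | null>(null);
  const [prompt, setPrompt] = useState("");
  const [output, setOutput] = useState("");

  // Send the PDF and the user's plain-language instructions to the
  // parsing endpoint, then show the returned JSON in the right panel.
  async function handleConvert() {
    if (!file) return;
    const body = new FormData();
    body.append("file", file);
    body.append("prompt", prompt);
    const res = await fetch("/api/parse", { method: "POST", body }); // hypothetical endpoint
    setOutput(JSON.stringify(await res.json(), null, 2));
  }

  return (
    <main className="grid grid-cols-2 gap-4 p-6">
      {/* Left panel: upload plus the prompt field */}
      <section>
        <input
          type="file"
          accept="application/pdf"
          onChange={(e) => setFile(e.target.files?.[0] ?? null)}
        />
        <textarea
          className="mt-4 w-full rounded border p-2"
          placeholder="Describe the data you want, e.g. 'extract each ingredient with name, quantity, and unit'"
          value={prompt}
          onChange={(e) => setPrompt(e.target.value)}
        />
        <button className="mt-2 rounded bg-black px-4 py-2 text-white" onClick={handleConvert}>
          Convert
        </button>
      </section>
      {/* Right panel: structured JSON output, updated after each run */}
      <section>
        <pre className="h-full overflow-auto rounded bg-gray-100 p-4 text-sm">{output}</pre>
      </section>
    </main>
  );
}
```

The design choice the sketch preserves is that upload, prompt, and output all live in one view, so users see the JSON change as they refine their instructions.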
Build Process
This project was coded entirely using Cursor, an AI-native coding environment. I leveraged its inline generation, autocompletion, and iterative coding features to build and refine the full frontend and backend without switching tools. Cursor's fluid AI-assisted workflow allowed me to move quickly from concept to implementation, especially in structuring the prompt logic and dynamic output panel.
- Frontend: React + TailwindCSS
- Backend: AI pipeline integrating OCR and OpenAI API for document parsing
- Deployment: Vercel
The AI interprets user instructions from a prompt field and uses document layout detection to output structured key-value JSON data, even from unstructured or scanned documents.
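To make that pipeline concrete, here is a hedged sketch of what the parsing step can look like using the official openai Node SDK; the extractText helper (standing in for the OCR / text-layer stage), the model choice, and the prompt wording are assumptions rather than the app's real implementation.

```typescript
// Sketch of an OCR -> LLM parsing pipeline. extractText is a hypothetical
// helper standing in for OCR / PDF text extraction; the rest uses the
// real OpenAI Node SDK (chat completions with JSON-mode output).
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function parsePdf(pdf: Buffer, userPrompt: string): Promise<unknown> {
  // 1. Get raw text out of the document (OCR for scans, text layer otherwise).
  const text = await extractText(pdf); // hypothetical extraction helper

  // 2. Ask the model to restructure that text according to the user's
  //    plain-language instructions, forcing a JSON object back.
  const completion = await client.chat.completions.create({
    model: "gpt-4o", // any JSON-mode-capable model
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          "You convert raw document text into structured JSON. " +
          "Follow the user's instructions for the desired fields exactly.",
      },
      {
        role: "user",
        content: `Instructions: ${userPrompt}\n\nDocument text:\n${text}`,
      },
    ],
  });

  // e.g. userPrompt = "extract each ingredient with name, quantity, and unit"
  // might yield: { "ingredients": [{ "name": "flour", "quantity": 2, "unit": "cups" }] }
  return JSON.parse(completion.choices[0].message.content ?? "{}");
}

declare function extractText(pdf: Buffer): Promise<string>; // assumed, e.g. tesseract.js or a PDF text layer
```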
Challenges & Lessons Learned
Challenge
Users were uploading low-resolution or scanned PDFs that caused inconsistent parsing and frustrating results.
Solution
I added a prompt customization field to guide the AI, and included light UX copy to educate users on how to phrase good instructions or prepare better PDFs.
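Beyond static copy, one way such guidance can be triggered automatically is by checking whether an upload even has a text layer before parsing. A minimal sketch, assuming pdfjs-dist; this is illustrative, not necessarily how PDF Penguin handles it:

```typescript
// Heuristic check (a sketch, not PDF Penguin's actual code): if the first
// page exposes almost no selectable text, the file is probably a scan,
// so we can warn the user before they burn a parse on a low-quality input.
import { getDocument } from "pdfjs-dist";

async function looksScanned(data: ArrayBuffer): Promise<boolean> {
  const pdf = await getDocument({ data }).promise;
  const page = await pdf.getPage(1);
  const { items } = await page.getTextContent();
  // Digital PDFs usually expose dozens of text items per page;
  // image-only scans expose none (or nearly none).
  return items.length < 5;
}

// Usage sketch: surface prep guidance instead of failing silently.
// if (await looksScanned(buffer)) showHint("This looks like a scan; results may vary. Try a higher-resolution file.");
```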
Lesson
AI isn't magic — but good UX can make it feel like it is. The best tools support both ideal and messy inputs, and guide users through uncertainty.
Future Improvements
- Add user authentication and upload history saving
- Allow exports to CSV and XML formats
- Improve support for low-quality scanned PDFs
- Mobile-responsive improvements
Impact
Since building PDF Penguin, I've used it to power real-time recipe parsing in Chefie and shared it with other developers who've since used it in document-heavy workflows. It's now a core part of my toolset and continues to inspire ideas for standalone API-based parsing services.