PDF Penguin

AI-powered PDF to JSON conversion for structured, usable data

A comprehensive document processing tool that transforms unstructured PDF data into clean, structured JSON for developers and data analysts.

🎯 Project Summary

TL;DR

Problem: Non‑technical users struggle to extract structured data from PDFs quickly.
Solution: Drag‑drop PDF → prompt desired structure → instant JSON output.
Success criteria: time‑to‑first‑output ≤ 10s, ≥ 90% parse success for common docs, clear fallback for scans.

While developing my recipe generator app, EZ Recipe, I needed a way to extract structured ingredient and nutrition data from USDA PDFs. The datasets were available, but they were formatted as complex, unstructured PDFs that were difficult to work with programmatically. Existing tools were unreliable or too technical, so I built a clean, AI-powered tool for parsing and exporting PDF data into usable JSON.

PDF Penguin has since evolved into a standalone product with broader application across document-heavy industries.

My Role

This was a completely solo project, where I handled every aspect from product vision to deployment.

Founder
Product Designer (UX & UI)
Frontend Developer (React, TailwindCSS)
AI Integration (Vision Models)

Problem & Goal

Problem

The biggest issue with existing PDF parsers was that they were overly technical — requiring specialized setup, manual formatting, command-line usage, or developer-only integrations. Tools like Tabula and Adobe's OCR exports were powerful but inaccessible to non-technical users. Many required users to predefine table structures or fiddle with JSON schemas before seeing results, which added friction for those just trying to extract usable information from documents.

Additionally, tools often failed with scanned documents or image-based PDFs, offering inconsistent or incomplete results. Even when they worked, the interfaces were cluttered and required unnecessary steps or downloads.

Success Metrics

≤ 10s time-to-first-output
90%+ parse success rate for common documents
Clear fallback guidance for scanned PDFs
Zero setup required for first use

Before (Current Tools)

Users spend 15+ minutes manually copying data from PDFs, dealing with complex setup, or getting inconsistent results from existing parsers.

After (PDF Penguin)

Drag, drop, type what you want — and get clean JSON in 10 seconds. No setup. No training. Just output.

PDF Penguin

I set out to design a tool that required zero onboarding: drag, drop, type what you want — and get clean JSON instantly. No setup. No training. Just output.

Design Process

Step 1: Empathize & Research

I explored a range of existing PDF parsing tools including Tabula, Adobe Acrobat's OCR export, and DocParser. While all three were technically capable, they presented significant barriers for non-technical users — requiring either installation, rule-building, or an understanding of export settings and schemas. I tested each one by attempting to extract structured data without relying on documentation or setup guides, simulating the experience of a first-time user with minimal technical background.

Across the board, I encountered slow onboarding, confusing interfaces, and results that required multiple adjustments or retries. Even simple use cases like "extract table data" demanded upfront learning or configuration. These pain points reinforced a clear gap in the space: a need for a tool that provides structure, flexibility, and results — without setup or specialized knowledge.

The table below summarizes how PDF Penguin compares to these tools, based on that usability-first evaluation.

Category	Tabula (Open Source)	Adobe Acrobat Export	DocParser	PDF Penguin
How to Use	Install app → select table area → export	Open PDF → Export To (Excel/Word) → fix formatting	Create parser → define rules/schema → run	Drag & drop PDF → describe output in plain language → copy JSON
Ease of Use	Manual area selection every time — long PDFs are painful	Export menus are cluttered; results vary per doc	Requires technical rule-building	Zero setup — natural prompts, instant results
Learning Curve	Steep — geared to developers	Moderate — still need guides/tutorials	High — requires schema expertise	None — works instantly
Setup Time	10–15 mins for first export; longer for large PDFs	5–10 mins per document, plus fixing errors	15–30 mins upfront per parser	<10s to first output
Output Quality	Raw, messy tables — struggles with multi-page PDFs	Exports often break formatting; images/tables distort	Accurate only if rules are perfect; brittle if format changes	Clean, structured JSON every time
Falls Short On…	Tedious with long/multi-page PDFs; messy copy-paste cleanup	Time wasted fixing Excel/Word errors; inconsistent exports	High upfront effort, breaks when PDF layout changes	Designed to handle any PDF instantly
Target Audience	Developers tinkering on small files	Business users with patience for cleanup	Technical ops/data teams	Anyone — including non-technical users

Findings: Competing tools work well but expect users to know setup, schemas, and exports. PDF Penguin is the fastest and simplest to operate: drag & drop, describe the structure, copy clean JSON.

Tabula (Open Source)

How to Use: Install app → select table area → export

Ease of Use: Manual area selection every time — long PDFs are painful

Learning Curve: Steep — geared to developers

Setup Time: 10–15 mins for first export

Adobe Acrobat Export

How to Use: Open PDF → Export To (Excel/Word) → fix formatting

Ease of Use: Export menus are cluttered; results vary per doc

Learning Curve: Moderate — still need guides/tutorials

Setup Time: 5–10 mins per document, plus fixing errors

DocParser

How to Use: Create parser → define rules/schema → run

Ease of Use: Requires technical rule-building

Learning Curve: High — requires schema expertise

Setup Time: 15–30 mins upfront per parser

PDF Penguin

How to Use: Drag & drop PDF → describe output in plain language → copy JSON

Ease of Use: Zero setup — natural prompts, instant results

Learning Curve: None — works instantly

Setup Time: <10s to first output

Findings: Competing tools work well but expect users to know setup, schemas, and exports. PDF Penguin is the fastest and simplest to operate: drag & drop, describe the structure, copy clean JSON.

Step 2: Define

PDFs are the universal standard for sharing information, but they weren’t designed for easy data extraction. From our research, one theme was clear: the biggest barrier with existing tools wasn’t raw technical capability — it was usability.

Tabula required manually drawing boxes around tables, a painful process on long or multi-page PDFs.

Adobe Acrobat exported to Word or Excel, but the formatting often broke, leaving users to waste time fixing errors.

DocParser offered accuracy, but only after heavy upfront investment in parser setup, schema design, and ongoing maintenance whenever the PDF layout changed.

Across all of these, the “define” step was the bottleneck: users were forced to either manually mark up data, accept inaccurate exports, or engineer schemas. None of these approaches fit the needs of non-technical users who just want structured data quickly.

Problem Statement

Extracting structured data from PDFs today is slow, technical, and inconsistent. Users need a solution that requires no setup, no manual definition, and no technical expertise — while still delivering clean, structured outputs instantly.

Step 3: Ideate

I brainstormed multiple UI approaches, including single-page processing (streamlined but potentially overwhelming), wizard-style multi-step (guided but slow), and two-panel layout (immediate feedback, familiar pattern).

I decided to move forward with a two-panel layout. This structure offered the clearest balance of simplicity and control: users could upload a PDF on the left while immediately seeing the structured output on the right. Unlike a single-page or wizard flow, the two-panel model provided instant feedback without overwhelming users with steps or clutter.

User Flow Mapping

With this direction in place, the next step was to validate how users would actually move through the product. To do this, I mapped out the complete user flow — from uploading a file, to defining the output, to copying the final JSON. This flow chart helped surface potential friction points and confirm that the two-panel design supported a smooth, low-effort experience.

PDF Penguin User Flow Chart showing the step-by-step process of uploading, processing, and extracting data from PDFs

💡 The flow shows the complete user journey from upload to structured output

Step 4: Prototype & Design

With a clear layout concept from the ideation phase, I began prototyping to test the interface flow and validate the user experience. I started with low-fidelity wireframes to quickly iterate on the core interaction patterns before moving into high-fidelity development.

Lo-Fi Prototype

With the concept validated through research, I moved into rapid prototyping to test the interface flow. The low-fi wireframes focused on three key aspects: the upload process, prompt customization, and JSON output display within the clean, minimal interface.

1. Upload Interface

Clean drag-and-drop area with clear visual feedback for file uploads.

2. JSON Output

Structured data display with syntax highlighting and copy functionality.

3. Library

Saved documents and parsed data organized in a clean library interface.

4. Library Document View

Viewing and managing individual documents from the library with parsed data.

Key Insights from Lo-Fi Testing:

Users preferred a two-panel layout for immediate visual feedback, the prompt field needed clear placeholder text to guide effective AI instructions, and error states required helpful messaging to guide users toward successful parsing.

Refinements Made:

I simplified the upload process to drag-and-drop only, added contextual placeholder text with examples, and integrated clear error messaging with actionable next steps for users.

High-Fidelity Prototypes

Based on the low-fidelity testing insights, I created polished high-fidelity prototypes that refined the visual design, improved the user experience, and prepared for final development.

1. Upload Interface

Clean, modern upload experience with drag-and-drop functionality, output format dropdown selection, and clear visual feedback.

2. JSON Output

Structured data display with syntax highlighting and copy functionality.

3. Library

Saved documents and parsed data organized in a clean library interface with color-coded file categories in the top left.

4. Library Document View

Specific document view showing detailed information with download and copy functionality for parsed data.

Try the interactive prototype:

Open Figma Prototype

Step 5: Test & Iterate

Each time I implemented a new UI or prompt behavior, I tested it by uploading different document types — invoices, receipts, reports — and refining the prompt UX to guide the AI parser. After discovering poor output from vague prompts, I added a customizable instruction field and clarified the placeholder text to guide user input.

Once the interface flow and AI behavior were reliable, I built the full app in Cursor with a React + TailwindCSS frontend, integrated OCR and OpenAI APIs, and deployed it via Vercel. The result is a working product with real users, capable of turning even messy PDFs into structured data in seconds.

Build Process

This project was coded entirely using Cursor, an AI-native coding environment. I leveraged its inline generation, autocompletion, and iterative coding features to build and refine the full frontend and backend without switching tools. Cursor's fluid AI-assisted workflow allowed me to move quickly from concept to implementation, especially in structuring the prompt logic and dynamic output panel.

Frontend: React + TailwindCSS
Backend: AI pipeline integrating OCR and OpenAI API for document parsing
Deployment: Vercel

The AI interprets user instructions from a prompt field and uses document layout detection to output structured key-value JSON data, even from unstructured or scanned documents.

Challenges & Lessons Learned

Challenge: Designing a Two-Click PDF Converter

The main challenge was keeping the experience as minimal as possible. Many converters overload users with extra steps — multiple menus, settings, or upsells — which slows them down. The goal for PDF Penguin was clear: users should be able to upload their file and get the converted result in just two clicks.

Lesson Learned

Stripping down features is harder than adding them. I had to carefully decide what was essential (upload → convert → download) and what could be excluded or postponed. This exercise taught me that simplicity isn't about doing less work — it's about making tough design choices to keep the user's path frictionless.

Challenge: Guiding Without Overwhelming

Even with a simple flow, users still need a sense of control (e.g., naming the file or choosing output format). Adding these options without cluttering the interface was a balancing act.

Lesson Learned

Clear defaults and progressive disclosure are key. By setting smart defaults, users can complete their task quickly, while still having the option to customize if needed.

Challenge: Maintaining Trust and Reliability

Because file conversion involves sensitive documents, users need to trust the process. Any hiccup — like unclear status indicators or unexpected results — could break that trust.

Lesson Learned

Feedback and transparency build confidence. Simple loading indicators, confirmation messages, and a visible "library" of past conversions reassured users that their files were safe and the process worked as expected.

🧠 Reflections & Takeaways

PDF Penguin might seem like a simple tool, but it addresses a fundamental gap in document processing. From data analysts to developers, people need to extract structured information from unstructured documents quickly and reliably. This tool empowers users to transform messy PDFs into usable data without technical barriers.

"I just need to get this table data into a spreadsheet, but the PDF is a mess."

As a designer and developer, this project taught me the value of solving real problems with simple solutions. Sometimes the best tools are the ones that eliminate complexity rather than adding features.

Key Learnings

AI + UX = Magic – The combination of AI capabilities with thoughtful user experience design can create tools that feel almost magical to use.
Simplicity Scales – The most powerful tools are often the simplest ones. PDF Penguin's success comes from doing one thing exceptionally well.
Real Problems, Real Solutions – Building tools to solve your own problems often leads to solutions that resonate with others facing similar challenges.

📈 Future Enhancements

Add user authentication and upload history saving
Allow exports to CSV and XML formats
Improve support for low-quality scanned PDFs
Mobile-responsive improvements

📍 Final Thoughts

This case study pushed me to think deeply about the intersection of AI capabilities and user experience design. PDF Penguin is about more than document processing — it's about making powerful technology accessible to everyone, regardless of their technical background.

Since building PDF Penguin, I've used it to power real-time parsing for multiple EZ Recipe recipes and shared it with other developers who've since used it in document-heavy workflows. It's now a core part of my toolset and continues to inspire ideas for standalone API-based parsing services.

Impact & Results

PDF Penguin has successfully achieved its core goals: reducing time-to-first-output to under 10 seconds, achieving high parse success rates for common document types, and providing a zero-setup experience that works for both technical and non-technical users. The tool has become an essential part of my development workflow and has been adopted by other developers facing similar document processing challenges.