
feat: Checkboxes and Radio Button Mapping for PDFs (GSoC)#407

Open
Dotify71 wants to merge 5 commits into fireform-core:main from Dotify71:fix/startup-and-tests

@Dotify71 Dotify71 commented Mar 31, 2026

This PR implements Checkboxes and Radio Button Mapping (Deliverable 1 from my GSoC Proposal).

Advanced PDF Form Mapping (src/filler.py)

  • Previously, filler.py indiscriminately assigned the LLM string output to the Value (/V) property of all Widget types. This breaks when a PDF contains checkboxes or radio buttons (Field Type /Btn).
  • Implementation: Added conditional logic inside the loop that iterates over sorted_annots:
    • Detects if an annotation is a Button (/Btn).
    • Checks the generated LLM response for truthy values (yes, true, 1, x).
    • Dynamically extracts the correct internal ON state identifier directly from the PDF Appearance Dictionary (annot.AP.N).
    • Assigns both the Value (/V) and Appearance State (/AS) to accurately check the box on the final PDF layout, while assigning /Off when falsy.
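The steps above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: annotations are modeled as plain dicts keyed by PDF names (a real PDF library would expose its own object types), and the helper names `is_truthy`, `find_on_state`, and `apply_button_state` are hypothetical.

```python
from typing import Optional

# Truthy strings listed in the PR description.
TRUTHY = {"yes", "true", "1", "x"}
OFF_STATE = "/Off"


def is_truthy(answer: str) -> bool:
    """Return True if the LLM answer is one of the accepted truthy strings."""
    return answer.strip().lower() in TRUTHY


def find_on_state(normal_appearance: dict) -> Optional[str]:
    """Return the first appearance-state name that is not /Off, if any."""
    for state_name in normal_appearance:
        if state_name != OFF_STATE:
            return state_name
    return None


def apply_button_state(annot: dict, answer: str) -> None:
    """Set /V (and /AS for buttons) on a dict-modeled annotation."""
    if annot.get("/FT") != "/Btn":
        # Non-button widgets keep the plain string value.
        annot["/V"] = answer
        return
    # Assumption: the ON state name lives in the normal appearance dict (/AP -> /N).
    on_state = find_on_state(annot.get("/AP", {}).get("/N", {}))
    state = on_state if (is_truthy(answer) and on_state) else OFF_STATE
    annot["/V"] = state
    annot["/AS"] = state
```

Reading the ON state from the appearance dictionary rather than hard-coding a name such as `/Yes` matters because PDF authoring tools are free to choose that name per field.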

Strict Boolean Typing (per maintainer feedback)

  • The fields dict now accepts Python types as values (e.g. {"is awake": bool})
  • build_prompt() detects bool fields and explicitly instructs the LLM to return only the literal string 'True' or 'False'.
  • add_response_to_json() strictly coerces LLM output to Python bool for boolean fields.
  • filler.py uses isinstance(answer, bool) to activate a checkbox/radio button.
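As a rough sketch of the typed fields dict and the prompt-side handling described above: the function `build_field_instruction` and the wording used for non-boolean fields are assumptions made for illustration; only the `{"is awake": bool}` shape and the True/False instruction come from this PR's description and comments.

```python
# A typed fields dict: field name -> expected Python type.
fields = {"is awake": bool, "patient name": str}


def build_field_instruction(field_name: str, field_type: type) -> str:
    """Return the per-field instruction to embed in the LLM prompt."""
    if field_type is bool:
        # Boolean fields get a strict instruction so the reply can be verified.
        return (
            f"Field '{field_name}': You MUST respond with ONLY "
            "the literal word True or False."
        )
    # Hypothetical default wording for every other field type.
    return f"Field '{field_name}': respond with the value as plain text."


instructions = [build_field_instruction(name, kind) for name, kind in fields.items()]
```

Carrying the type in the dict means the same information can drive both the prompt and the later validation of the reply.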

(Note: The previous redundant fixes for #135 and #380 were dropped via a merge with upstream/main)

Dushyant Acharya and others added 3 commits March 31, 2026 23:51
Implemented PDF /Btn dictionary parsing in filler.py to extract and dynamically map truthy LLM outputs to their specific 'ON' Appearance Mode instead of blindly appending strings. Also resolved broken backend pipeline in main.py by initializing the base Controller instead of the removed Fill class.
@Dotify71 (Author)

Hi @marcvergees @juanalvv! I've resolved the merge conflicts in Makefile and src/main.py. The PR is now up to date with the latest main branch and ready for review/merge. It includes the fixes for #135 and #380, along with the PDF checkbox/radio mapping feature. Thanks!

@marcvergees (Member) left a comment


There's one thing you need to take care of. As implemented, if the LLM output is something like "yes" or "true" and the fillable field is a radio button, you activate it. But how do you know that the LLM is going to return something like that? There should be something in the pipeline and the fields dict that marks which fields are booleans. E.g. the fields dict is {"is awake": boolean}, and the prompt tells the LLM to return a boolean, specifically True or False. That way, we can verify that the filled value is what we expect. Please fix that and we'll be able to test and merge it.

Per maintainer feedback in fireform-core#407:
- The fields dict now accepts Python types as values (e.g. {'is awake': bool})
- build_prompt() detects bool fields and explicitly instructs the LLM to
  return only the literal string 'True' or 'False', not fuzzy values
- add_response_to_json() strictly coerces LLM output to Python bool for
  bool fields, logging a warning if an unexpected value is returned
- filler.py now uses isinstance(answer, bool) instead of string matching
  so only a guaranteed Python True activates a checkbox/radio button
- Updated example in main.py to demonstrate the new typed fields dict

Dotify71 commented May 15, 2026

Hi @marcvergees! Thanks for the clear feedback. I've addressed your concern in the latest commit (9feeb78).

The fields dict now carries Python type annotations (e.g. {"is awake": bool}). When build_prompt() sees a bool field, it explicitly instructs the LLM: "You MUST respond with ONLY the literal word True or False." The response is then strictly coerced to a Python bool in add_response_to_json() — any unexpected value logs a warning and defaults to None. Finally, filler.py uses isinstance(answer, bool) instead of fuzzy string matching, so only a guaranteed Python True activates a checkbox/radio button. Ready for re-review!
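A minimal sketch of the coercion and the checkbox gate described in this comment, under stated assumptions: `coerce_bool` and `should_activate` are hypothetical names, and stripping surrounding whitespace before comparing is a choice made here, not something the PR confirms.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def coerce_bool(raw: str) -> Optional[bool]:
    """Map the literal strings 'True'/'False' to bool; anything else becomes None."""
    text = raw.strip()  # assumption: tolerate surrounding whitespace only
    if text == "True":
        return True
    if text == "False":
        return False
    logger.warning("Expected 'True' or 'False' for a bool field, got %r", raw)
    return None


def should_activate(answer: object) -> bool:
    """Only a real Python True activates a checkbox or radio button."""
    return isinstance(answer, bool) and answer
```

With this gate, values such as the string `"yes"`, the integer `1`, or `None` leave the button off, because none of them is a `bool` instance.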

@Dotify71 changed the title from "fix: import Union in main.py and correct pytest directory in Makefile" to "feat: Checkboxes and Radio Button Mapping for PDFs (GSoC)" on May 24, 2026
@Dotify71 requested a review from marcvergees on May 24, 2026, 08:21
@Dotify71 (Author)

Hi @marcvergees! I've updated the PR. I pulled in the latest changes from main to resolve the merge conflicts and dropped my old redundant fixes for #135 and #380 (since you guys already took care of those!).

The PR is now strictly focused on the GSoC Checkbox and Radio Button mapping feature, and it incorporates your feedback regarding strict boolean typing. It's fully up to date and ready for you to take another look whenever you have a moment. Thanks!
