Generate AST-Friendly Resume #148

Closed
neviaumi opened this issue Feb 24, 2025 · 2 comments

@neviaumi
Owner

Currently, the resume generated by the project is designed to be human-friendly, but it does not parse well in Applicant Tracking Systems (ATS).

When submitting the resume to application sites with an "auto-import" feature, the experience is often poor as the parsing fails.

This issue may be caused by the presence of highlighted keywords or other formatting elements which interfere with accurate parsing.


Key Ideas

  1. Adjusting the Output for ATS Compatibility:

    • Modify the generated resume output so that ATS parsers can extract its text and sections reliably.
  2. Testing for ATS Compatibility:

    • Use an existing tool or library to check whether the generated PDF is ATS-friendly.
    • Run the generated resume through common ATS (Applicant Tracking System) environments as part of the testing process to ensure compatibility.
  3. Removing Problematic Formatting:

    • Investigate and resolve issues caused by elements like highlighted keywords or other non-standard formatting that may break ATS parsers.

Expected Outcome

The new feature should generate resumes that:

  • Preserve a professional human-friendly appearance.
  • Are compatible with ATS software to avoid parsing issues.

By addressing this gap, the resumes can perform optimally in both human and automated review systems, ensuring they are effectively processed by application platforms.

Let me know if additional context or clarification is needed!

@neviaumi
Owner Author

How to Test ATS Compatibility with Resumes

Testing ATS compatibility checks that the generated resume is parsed correctly by Applicant Tracking Systems (ATS). Here’s how to approach it:


Key Challenges with ATS Parsing

  1. PDF Formatting Issues: Tables, columns, charts, and complex layouts can cause parsing errors.
  2. Special Formatting: Highlighted keywords, non-standard fonts, and symbols may confuse parsers.
  3. Improper Structuring: Missing or unclear sections (e.g., Work Experience, Education) reduce compatibility.

How to Test for ATS Compatibility

1. Manual Testing with ATS Tools

  • Platforms to try: ATS resume checkers such as Jobscan.
  • Upload your resume to these tools and analyze the parsing results.
  • Test it on job portals like LinkedIn, Indeed, or Workday.

2. Automated PDF Parsing Locally

  • Use a library such as pdf-parse to read the PDF's text layer.
  • Example script: Extract text from the resume and validate critical sections (e.g., Work Experience, Skills, Education).
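// Extract the PDF's text layer and check that the key section headings survive.
const pdfParser = require('pdf-parse');
const fs = require('fs');

const dataBuffer = fs.readFileSync('./path-to-resume/resume.pdf');

pdfParser(dataBuffer)
  .then((data) => {
    const text = data.text;

    console.log('Parsed content:', text);

    // Plain substring checks: each heading must appear somewhere in the text.
    const hasExperience = text.includes('Work Experience');
    const hasEducation = text.includes('Education');
    const hasSkills = text.includes('Skills');

    if (hasExperience && hasEducation && hasSkills) {
      console.log('ATS-friendly: All sections found.');
    } else {
      console.log('ATS issue: Missing sections.');
      process.exitCode = 1; // signal failure when run from a script or CI job
    }
  })
  .catch((err) => {
    console.error('Could not parse the PDF:', err.message);
    process.exitCode = 1;
  });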
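
The substring check in the script above also passes when a word such as "Skills" only appears inside a sentence. A slightly stricter sketch, assuming that section headings sit on their own line in the extracted text; `findSections`, `escapeRegExp`, and `REQUIRED_SECTIONS` are illustrative names and not part of pdf-parse:

```javascript
// Hypothetical helper: the names and the default section list are illustrative.
const REQUIRED_SECTIONS = ['Work Experience', 'Education', 'Skills'];

// Escape regex metacharacters so a section name is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Report which headings appear alone on a line (case-insensitive,
// optional trailing colon) and which are missing.
function findSections(text, sections = REQUIRED_SECTIONS) {
  const lines = text.split(/\r?\n/).map((line) => line.trim());
  const found = [];
  const missing = [];
  for (const name of sections) {
    const pattern = new RegExp(`^${escapeRegExp(name)}:?$`, 'i');
    const index = lines.findIndex((line) => pattern.test(line));
    if (index === -1) {
      missing.push(name);
    } else {
      found.push({ name, line: index + 1 }); // 1-based line number
    }
  }
  return { ok: missing.length === 0, found, missing };
}
```

The returned `found` entries carry line numbers, so a follow-up check could also confirm that the sections appear in the expected order.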

3. Use OCR Simulations

  • Use Tesseract.js to convert the resume into plain text and validate its structure.
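
One way to use the OCR output is to compare it with the PDF's embedded text layer: words that are visible on the page but absent from the text layer are words a text-based parser will not see. The sketch below assumes both strings are already available (for example, the text layer from pdf-parse and the OCR text from Tesseract.js run on a rasterized page image, since as far as I know Tesseract.js reads images rather than PDF files). `tokenize` and `textLayerCoverage` are hypothetical names:

```javascript
// Hypothetical helpers: compare the PDF's embedded text with OCR output.

// Lowercase the text and split it into alphanumeric tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Fraction of distinct OCR tokens that also occur in the embedded text layer.
// A value well below 1 suggests visible content that parsers cannot read.
function textLayerCoverage(embeddedText, ocrText) {
  const embedded = new Set(tokenize(embeddedText));
  const ocr = new Set(tokenize(ocrText));
  if (ocr.size === 0) return 1; // nothing visible, so nothing is missing
  let shared = 0;
  for (const token of ocr) {
    if (embedded.has(token)) shared += 1;
  }
  return shared / ocr.size;
}
```

OCR output is noisy, so any pass/fail threshold on this ratio would need tuning against real resumes rather than an exact match.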

Automating ATS Testing with Playwright

You can use Playwright to automate ATS testing on real job portals or simulate ATS behavior:

const { test, expect } = require('@playwright/test');

// The portal URL, selectors, and credentials below are placeholders; adapt them to the target site.
test('Upload and validate resume parsing', async ({ page }) => {
  // Log in to the job portal.
  await page.goto('https://example-job-portal.com');
  await page.fill('#email', '[email protected]');
  await page.fill('#password', 'password123');
  await page.click('button[type="submit"]');

  // Upload the generated PDF through the portal's file input.
  await page.click('text=Upload Resume');
  const resumePath = './path-to-resume/resume.pdf';
  await page.setInputFiles('input[type="file"]', resumePath);

  // Wait for the portal's parsed view, then check the key sections.
  await page.waitForSelector('.parsed-resume');
  const parsedResume = await page.textContent('.parsed-resume');

  expect(parsedResume).toContain('Work Experience');
  expect(parsedResume).toContain('Education');
  expect(parsedResume).toContain('Skills');
});

Suggested Workflow

  1. Generate the Resume: Automatically create a PDF via your system.
  2. Run Parsing Tests:
    • Use ATS tools (e.g., Jobscan, job portals).
    • Test locally with libraries like pdf-parse or OCR tools like Tesseract.js.
  3. Validate Structure: Ensure all key sections (e.g., Work Experience, Skills) are present and correctly parsed.
  4. Integrate Tests into CI/CD Pipelines: Automate validation through Playwright or Node.js scripts for consistent testing.

These strategies ensure that resumes are both human-readable and ATS-compatible.

@neviaumi
Owner Author

Summary: Using LaTeX and JSON for an ATS-Friendly Resume Workflow

Key Context:

  1. LaTeX Benefits:

    • LaTeX is a powerful tool for creating professional, clean, and ATS-friendly resumes.
    • It offers precise control over formatting and generates highly readable and structured PDF files.
    • Existing LaTeX resume templates (e.g., moderncv, AltaCV) can be used or customized to fit specific needs.
  2. ATS-Friendly Design:

    • Use a plain-text structure with clear section headers and minimal visual decoration (no icons or images).
    • Sections like Work Experience, Education, and Skills follow a predictable layout for better ATS parsing.
  3. Dynamic Workflow:

    • JSON can serve as the data source for resume content.
    • A Node.js script can fetch the JSON data, construct LaTeX code dynamically, and generate the PDF using pdflatex.
    • A fully programmable approach allows for scalability, reusability, and automation.

High-Level Workflow

Step 1: JSON Resume Format

Store your resume in JSON format. Example:

{
  "basics": {
    "name": "John Doe",
    "email": "[email protected]",
    "phone": "+1234567890",
    "website": "https://johndoe.com"
  },
  "work": [
    {
      "position": "Full Stack Developer",
      "company": "XYZ Inc.",
      "startDate": "2020-01",
      "endDate": "Present",
      "summary": "Developed scalable backend systems using Django.",
      "highlights": [
        "Improved API performance by 30%.",
        "Collaborated with product designers and PMs."
      ]
    },
    {
      "position": "Intern",
      "company": "ABC Corp.",
      "startDate": "2018-06",
      "endDate": "2018-12",
      "summary": "Developed internal tools for automation."
    }
  ]
}
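
Before the JSON is turned into LaTeX, it can help to check that the fields the template relies on are present. A minimal sketch, assuming the shape of the example above; `validateResume` and its list of required fields are hypothetical and not taken from any published schema:

```javascript
// Hypothetical validator: the required fields mirror the example JSON and are an assumption.
function validateResume(resume) {
  const problems = [];
  const basics = (resume && resume.basics) || {};
  for (const field of ['name', 'email']) {
    if (typeof basics[field] !== 'string' || basics[field].trim() === '') {
      problems.push(`basics.${field} is missing`);
    }
  }

  const work = resume && Array.isArray(resume.work) ? resume.work : [];
  if (work.length === 0) problems.push('work has no entries');

  // Dates in the example use YYYY-MM; endDate may also be "Present" or absent.
  const isYearMonth = (d) => /^\d{4}-\d{2}$/.test(d);
  work.forEach((job, i) => {
    for (const field of ['position', 'company', 'startDate']) {
      if (!job[field]) problems.push(`work[${i}].${field} is missing`);
    }
    if (job.startDate && !isYearMonth(job.startDate)) {
      problems.push(`work[${i}].startDate is not YYYY-MM`);
    }
    if (job.endDate && job.endDate !== 'Present' && !isYearMonth(job.endDate)) {
      problems.push(`work[${i}].endDate is not YYYY-MM`);
    }
  });
  return problems; // an empty array means no problems were found
}
```

Returning a list of problems rather than throwing on the first one lets a build script print every issue in a single run.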

Step 2: LaTeX Template

Prepare a LaTeX template with placeholders for JSON values. Example:

\documentclass[a4paper,11pt]{article}
\usepackage[a4paper,margin=1in]{geometry}
\usepackage{url}

\begin{document}

% Header Section
\begin{center}
    {\Huge \textbf{<%= name %>}} \\
    Email: <%= email %> \quad \textbar{} \quad Phone: <%= phone %> \quad \textbar{} \quad Website: \url{<%= website %>}
\end{center}

% Work Experience
\section*{Work Experience}
<% work.forEach(function(job){ %>
  \textbf{<%= job.position %>} at \textit{<%= job.company %>} \\
  <%= job.startDate %> -- <%= job.endDate || "Present" %> \\
  <%= job.summary %> \\
  <% if (job.highlights && job.highlights.length) { %>
  \begin{itemize}
    <% job.highlights.forEach(function(highlight){ %>
      \item <%= highlight %>
    <% }); %>
  \end{itemize}
  <% } %>
<% }); %>

\end{document}
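
Values from the JSON cannot be pasted into LaTeX verbatim: in a highlight such as "Improved API performance by 30%.", the `%` starts a LaTeX comment and silently drops the rest of the line. The sketch below shows one way to escape values before they reach the template, plus a plain-string version of the per-job block. `escapeLatex` and `renderJob` are hypothetical names, and the block builds strings directly instead of relying on a particular template engine:

```javascript
// Hypothetical helpers for building LaTeX from resume JSON.
const LATEX_SPECIALS = {
  '\\': '\\textbackslash{}',
  '{': '\\{',
  '}': '\\}',
  '%': '\\%',
  '&': '\\&',
  '#': '\\#',
  '$': '\\$',
  '_': '\\_',
  '~': '\\textasciitilde{}',
  '^': '\\textasciicircum{}',
};

// Escape LaTeX special characters in a single pass, so braces introduced
// by one replacement are not escaped again by another.
function escapeLatex(value) {
  return String(value).replace(/[\\{}%&#$_~^]/g, (ch) => LATEX_SPECIALS[ch]);
}

// Build the LaTeX for one work entry, mirroring the template's layout.
function renderJob(job) {
  const lines = [
    `\\textbf{${escapeLatex(job.position)}} at \\textit{${escapeLatex(job.company)}} \\\\`,
    `${escapeLatex(job.startDate)} -- ${escapeLatex(job.endDate || 'Present')} \\\\`,
    escapeLatex(job.summary || ''),
  ];
  const highlights = job.highlights || [];
  // Emit itemize only when there are items; an empty itemize is a LaTeX error.
  if (highlights.length > 0) {
    lines.push('\\begin{itemize}');
    for (const h of highlights) lines.push(`  \\item ${escapeLatex(h)}`);
    lines.push('\\end{itemize}');
  }
  return lines.join('\n');
}
```

The same `escapeLatex` function could instead be passed into a template engine as a helper, provided the engine's own output escaping (typically HTML escaping) is turned off for those values.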

neviaumi added a commit that referenced this issue Feb 24, 2025
this part of #148
Introduce an ADR documenting the decision to use HTML for visually appealing, web-based resumes and LaTeX for ATS-friendly, structured PDF outputs. Both formats pull from a single JSON source to ensure consistency and reduce duplication while fulfilling distinct user and business needs.

Signed-off-by: David Ng <[email protected]>
neviaumi added a commit that referenced this issue Feb 24, 2025
GH-148: Add ADR for HTML and LaTeX-based resume generation