Project: Tech Talent Recruiting with Regex

As a Data Analyst at a leading global HR consultancy, your mission is to delve into an extensive database of resumes to identify suitable candidates for tech-focused roles. This task involves using regular expressions to extract key data points and applying data preprocessing techniques to organize this information effectively.

Dataset Summary

resumes.csv

Column	Data Type	Description
`ID`	float	Unique identifier for each resume.
`Resume_str`	object	Full text of the resume, rich with details for analysis.
`Category`	object	Job category of the resume, indicating the field of expertise.

Let's Get Started!

Apply regular expressions and data preprocessing to a real-world HR challenge. This project is your chance to improve the hiring process by helping tech talent find their ideal job. Let's begin!

import pandas as pd
import re

# Load the resume dataset from a CSV file into a DataFrame
resumes = pd.read_csv('resumes.csv')
resumes.sample(3)

regex_skills = r"\b(python|sql|r|excel)\b"
regex_job_title = r"\b([A-Z\s\.\,\-]+)\b"
regex_education = r"\b(PHD|MSc|Master|BSc|Bachelor)\b"
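To see what the job-title and skills patterns capture before running them over the whole dataset, here is a minimal sketch on a single string. The sample text is made up for illustration and is not taken from resumes.csv; the names `sample_resume`, `sample_title` and `sample_skills` are hypothetical.

```python
import re

# Patterns copied from the cell above.
regex_skills = r"\b(python|sql|r|excel)\b"
regex_job_title = r"\b([A-Z\s\.\,\-]+)\b"

# Hypothetical resume text, made up for illustration (not from resumes.csv).
sample_resume = (
    "DATA ANALYST Summary Experienced analyst skilled in Python, SQL and Excel."
)

# re.search returns the first run of upper-case letters, whitespace and
# punctuation; strip() removes the trailing space left by the match.
title_match = re.search(regex_job_title, sample_resume)
sample_title = title_match.group(0).strip() if title_match else ""

# re.findall returns every skill occurrence; IGNORECASE makes "SQL" match "sql".
sample_skills = re.findall(regex_skills, sample_resume, flags=re.IGNORECASE)

print(sample_title)   # DATA ANALYST
print(sample_skills)  # ['Python', 'SQL', 'Excel']
```

The title pattern stops at the first lower-case letter, so it assumes each resume opens with its job title in capitals. Note also that `r` in the skills pattern matches any standalone letter R, such as a middle initial, so some matches may be false positives.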

job_titles = []
tech_skills = []
educations = []

for resume in resumes['Resume_str']:
    job_title_match = re.search(regex_job_title, resume)
    if job_title_match is not None:
        job_title = job_title_match.group(0).strip()
    else:
        job_title = ""
    job_titles.append(job_title)
    
    skills_match = re.findall(regex_skills, resume, flags=re.IGNORECASE)
    unique_skills = []
    for skill in skills_match:
        skill_title = skill.title()
        if skill_title not in unique_skills:
            unique_skills.append(skill_title)
    tech_skills.append(", ".join(unique_skills))
    
    education_matches = re.findall(regex_education, resume, flags=re.IGNORECASE)
    unique_education = []
    for education in education_matches:
        education_title = education.title()
        if education_title not in unique_education:
            unique_education.append(education_title)
    educations.append(", ".join(unique_education))
    
resumes['job_title'] = job_titles
resumes['tech_skills'] = tech_skills
resumes['education'] = educations
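The skills and education branches of the loop above repeat the same find, title-case and de-duplicate steps. One possible way to factor that out is sketched below; `extract_unique` is a hypothetical helper name, not part of the original notebook, and it assumes first-seen order should be kept.

```python
import re

def extract_unique(pattern, text, sep=", "):
    """Return the title-cased unique matches of `pattern` in `text`, in first-seen order."""
    matches = re.findall(pattern, text, flags=re.IGNORECASE)
    # dict.fromkeys keeps the first occurrence of each key and preserves insertion order
    unique = dict.fromkeys(match.title() for match in matches)
    return sep.join(unique)

example = extract_unique(
    r"\b(python|sql|r|excel)\b",
    "Python, python and SQL; also R and sql",
)
print(example)  # Python, Sql, R
```

With a helper like this, each branch of the loop could shrink to a single call per pattern, and both columns would be guaranteed to use the same separator.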

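# Keep only resumes where all three fields were extracted
resumes_filtered = resumes[(resumes['job_title'] != "") & (resumes['tech_skills'] != "") & (resumes['education'] != "")]

# Copy the selection so later changes do not act on a view of resumes_filtered
candidates_df = resumes_filtered[["ID", "job_title", "tech_skills", "education"]].copy()

# Normalize column names to lower case (ID becomes id)
candidates_df.columns = candidates_df.columns.str.lower()

# Drop rows with missing values
candidates_df = candidates_df.dropna()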

candidates_df.sample(10)
