
Keyword Analyzer

KeywordAnalyzer

KeywordAnalyzer(semantic_matcher=None)

Analyzer for matching keywords between job descriptions and resumes.

Uses multiple matching strategies:

1. Exact token matching (case-insensitive, normalized)
2. Fuzzy string matching (partial ratio >= 80%)
3. Semantic similarity (if a semantic_matcher is provided, threshold >= 0.7)
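The three-stage cascade can be sketched as follows. This is a minimal illustration, not the library's implementation: it substitutes stdlib `difflib` for the fuzzy matcher (the real analyzer uses a partial-ratio score on a 0–100 scale) and takes a hypothetical `semantic_sim` callable in place of the semantic matcher.

```python
import difflib
from typing import Callable, Optional


def match_keyword(
    keyword: str,
    resume_text: str,
    semantic_sim: Optional[Callable[[str, str], float]] = None,
) -> bool:
    """Illustrative cascade: exact token -> fuzzy -> optional semantic."""
    kw = keyword.lower().strip()
    tokens = resume_text.lower().split()
    # 1. Exact token matching (case-insensitive)
    if kw in tokens:
        return True
    # 2. Fuzzy matching: difflib ratio as a stand-in for partial_ratio >= 80%
    best = max(
        (difflib.SequenceMatcher(None, kw, tok).ratio() for tok in tokens),
        default=0.0,
    )
    if best >= 0.8:
        return True
    # 3. Semantic similarity (threshold >= 0.7), only if a matcher is supplied
    return semantic_sim is not None and semantic_sim(kw, resume_text) >= 0.7
```

Each stage only runs if the cheaper one before it fails, so exact hits never pay the fuzzy or semantic cost.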

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `semantic_matcher` | `SemanticMatcher \| None` | Optional semantic matcher for ML-powered similarity. If provided, enables semantic keyword matching in addition to exact and fuzzy matching. |

Example

```python
# Basic analyzer (exact + fuzzy matching only)
analyzer = KeywordAnalyzer()

# With semantic matching
from at_scorer.ml import SemanticMatcher, EmbeddingModel

model = EmbeddingModel()
matcher = SemanticMatcher(model)
analyzer = KeywordAnalyzer(semantic_matcher=matcher)

score, matched, missing = analyzer.analyze(resume_text, keywords)
```

Initialize the keyword analyzer.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `semantic_matcher` | `SemanticMatcher \| None` | Optional semantic matcher for ML-powered similarity. If `None`, only exact and fuzzy matching will be used. | `None` |
Source code in `at_scorer/analyzers/keyword.py`:

```python
def __init__(self, semantic_matcher: SemanticMatcher | None = None):
    """Initialize the keyword analyzer.

    Args:
        semantic_matcher: Optional semantic matcher for ML-powered similarity.
            If None, only exact and fuzzy matching will be used.
    """
    self.semantic_matcher = semantic_matcher
```

Functions

analyze
analyze(resume_text, keywords)

Analyze keyword matches between resume text and job keywords.

Matches keywords using multiple strategies:

- Exact token matching (normalized, case-insensitive)
- Fuzzy string matching (partial ratio >= 80%)
- Semantic similarity (if a semantic_matcher is available, threshold >= 0.7)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `resume_text` | `str` | The full text content of the resume. | required |
| `keywords` | `Iterable[str]` | Iterable of keywords from the job description to match. | required |

Returns:

| Type | Description |
| --- | --- |
| `tuple[float, list[str], list[str]]` | A tuple of `(score, matched_keywords, missing_keywords)`: `score` is a float between 0.0 and 1.0 representing the ratio of matched keywords to total keywords; `matched_keywords` is the list of keywords that were successfully matched; `missing_keywords` is the list of keywords that were not found. |

Example

```python
resume_text = "Experienced Python developer with FastAPI..."
keywords = ["Python", "FastAPI", "PostgreSQL", "Docker"]
score, matched, missing = analyzer.analyze(resume_text, keywords)
# score might be 0.75 (3 out of 4 matched)
# matched: ["Python", "FastAPI", "PostgreSQL"]
# missing: ["Docker"]
```
Source code in `at_scorer/analyzers/keyword.py`:

````python
def analyze(
    self, resume_text: str, keywords: Iterable[str]
) -> tuple[float, list[str], list[str]]:
    """Analyze keyword matches between resume text and job keywords.

    Matches keywords using multiple strategies:
    - Exact token matching (normalized, case-insensitive)
    - Fuzzy string matching (partial ratio >= 80%)
    - Semantic similarity (if semantic_matcher available, threshold >= 0.7)

    Args:
        resume_text: The full text content of the resume.
        keywords: Iterable of keywords from the job description to match.

    Returns:
        Tuple containing:
            - score: Float between 0.0 and 1.0 representing the ratio of
              matched keywords to total keywords.
            - matched_keywords: List of keywords that were successfully matched.
            - missing_keywords: List of keywords that were not found.

    Example:
        ```python
        resume_text = "Experienced Python developer with FastAPI..."
        keywords = ["Python", "FastAPI", "PostgreSQL", "Docker"]
        score, matched, missing = analyzer.analyze(resume_text, keywords)
        # score might be 0.75 (3 out of 4 matched)
        # matched: ["Python", "FastAPI", "PostgreSQL"]
        # missing: ["Docker"]
        ```
    """
    normalized_resume = normalize_text(resume_text)
    resume_tokens = set(tokenize(normalized_resume))
    normalized_keywords = [normalize_text(k) for k in keywords if k]
    matched: list[str] = []
    missing: list[str] = []

    for kw in normalized_keywords:
        if not kw:
            continue
        if kw in resume_tokens:
            matched.append(kw)
            continue
        fuzzy_score = fuzz.partial_ratio(kw, normalized_resume)
        semantic_score = 0.0
        if self.semantic_matcher:
            semantic_score = self.semantic_matcher.similarity(kw, normalized_resume)
        if fuzzy_score >= 80 or semantic_score >= 0.7:
            matched.append(kw)
        else:
            missing.append(kw)

    total = len(normalized_keywords) or 1
    score = len(matched) / total
    return score, matched, missing
````
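The `normalize_text` and `tokenize` helpers used above are imported from elsewhere in the package and are not shown on this page. A plausible stdlib sketch of what they do, assuming lowercase normalization and word-like tokenization (the package's actual behavior, e.g. its punctuation handling, may differ):

```python
import re


def normalize_text(text: str) -> str:
    # Lowercase and collapse runs of whitespace (assumed behavior).
    return re.sub(r"\s+", " ", text.lower()).strip()


def tokenize(text: str) -> list[str]:
    # Split into word-like tokens, keeping characters common in tech
    # skill names such as "c++" and "c#" (assumed behavior).
    return re.findall(r"[a-z0-9+#.]+", text)
```

Note that `analyze` tokenizes the already-normalized resume, so exact matching compares lowercase keyword against lowercase tokens. The `total = len(normalized_keywords) or 1` guard means an empty keyword list yields a score of 0.0 rather than a division-by-zero error.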