
Keyword Analyzer

KeywordAnalyzer

KeywordAnalyzer(semantic_matcher=None)

Analyzer for matching keywords between job descriptions and resumes.

Uses multiple matching strategies:

1. Exact token matching (case-insensitive, normalized)
2. Fuzzy string matching (partial ratio >= 80%)
3. Semantic similarity (if a semantic_matcher is provided, threshold >= 0.7)
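The three-stage cascade can be sketched as follows. This is a minimal illustration, not the library's implementation: it substitutes stdlib `difflib` for the fuzzy matcher (the real analyzer uses a partial-ratio score on a 0–100 scale) and takes a hypothetical `semantic_sim` callable in place of the semantic matcher.

```python
import difflib
from typing import Callable, Optional


def match_keyword(
    keyword: str,
    resume_text: str,
    semantic_sim: Optional[Callable[[str, str], float]] = None,
) -> bool:
    """Illustrative cascade: exact token -> fuzzy -> optional semantic."""
    kw = keyword.lower().strip()
    tokens = resume_text.lower().split()
    # 1. Exact token matching (case-insensitive)
    if kw in tokens:
        return True
    # 2. Fuzzy matching: difflib ratio as a stand-in for partial_ratio >= 80%
    best = max(
        (difflib.SequenceMatcher(None, kw, tok).ratio() for tok in tokens),
        default=0.0,
    )
    if best >= 0.8:
        return True
    # 3. Semantic similarity (threshold >= 0.7), only if a matcher is supplied
    return semantic_sim is not None and semantic_sim(kw, resume_text) >= 0.7
```

Each stage only runs if the cheaper one before it fails, so exact hits never pay the fuzzy or semantic cost.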

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `semantic_matcher` | `SemanticMatcher \| None` | Optional semantic matcher for ML-powered similarity. If provided, enables semantic keyword matching in addition to exact and fuzzy matching. |

Example

```python
# Basic analyzer (exact + fuzzy matching only)
analyzer = KeywordAnalyzer()

# With semantic matching
from at_scorer.ml import SemanticMatcher, EmbeddingModel

model = EmbeddingModel()
matcher = SemanticMatcher(model)
analyzer = KeywordAnalyzer(semantic_matcher=matcher)

score, matched, missing = analyzer.analyze(resume_text, keywords)
```

Initialize the keyword analyzer.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `semantic_matcher` | `SemanticMatcher \| None` | Optional semantic matcher for ML-powered similarity. If `None`, only exact and fuzzy matching will be used. | `None` |
Source code in `at_scorer/analyzers/keyword.py`:

```python
def __init__(self, semantic_matcher: SemanticMatcher | None = None):
    """Initialize the keyword analyzer.

    Args:
        semantic_matcher: Optional semantic matcher for ML-powered similarity.
            If None, only exact and fuzzy matching will be used.
    """
    self.semantic_matcher = semantic_matcher
```

Functions

analyze
analyze(resume_text, keywords)

Analyze keyword matches between resume text and job keywords.

Matches keywords using multiple strategies:

- Exact token matching (normalized, case-insensitive)
- Fuzzy string matching (partial ratio >= 80%)
- Semantic similarity (if a semantic_matcher is available, threshold >= 0.7)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `resume_text` | `str` | The full text content of the resume. | required |
| `keywords` | `Iterable[str]` | Iterable of keywords from the job description to match. | required |

Returns:

| Type | Description |
| --- | --- |
| `tuple[float, list[str], list[str]]` | A tuple of `(score, matched_keywords, missing_keywords)`: `score` is a float between 0.0 and 1.0 representing the ratio of matched keywords to total keywords; `matched_keywords` is the list of keywords that were successfully matched; `missing_keywords` is the list of keywords that were not found. |

Example

```python
resume_text = "Experienced Python developer with FastAPI..."
keywords = ["Python", "FastAPI", "PostgreSQL", "Docker"]
score, matched, missing = analyzer.analyze(resume_text, keywords)
# score might be 0.75 (3 out of 4 matched)
# matched: ["Python", "FastAPI", "PostgreSQL"]
# missing: ["Docker"]
```
Source code in `at_scorer/analyzers/keyword.py`:

````python
def analyze(
    self, resume_text: str, keywords: Iterable[str]
) -> tuple[float, list[str], list[str]]:
    """Analyze keyword matches between resume text and job keywords.

    Matches keywords using multiple strategies:
    - Exact token matching (normalized, case-insensitive)
    - Fuzzy string matching (partial ratio >= 80%)
    - Semantic similarity (if semantic_matcher available, threshold >= 0.7)

    Args:
        resume_text: The full text content of the resume.
        keywords: Iterable of keywords from the job description to match.

    Returns:
        Tuple containing:
            - score: Float between 0.0 and 1.0 representing the ratio of
              matched keywords to total keywords.
            - matched_keywords: List of keywords that were successfully matched.
            - missing_keywords: List of keywords that were not found.

    Example:
        ```python
        resume_text = "Experienced Python developer with FastAPI..."
        keywords = ["Python", "FastAPI", "PostgreSQL", "Docker"]
        score, matched, missing = analyzer.analyze(resume_text, keywords)
        # score might be 0.75 (3 out of 4 matched)
        # matched: ["Python", "FastAPI", "PostgreSQL"]
        # missing: ["Docker"]
        ```
    """
    normalized_resume = normalize_text(resume_text)
    resume_tokens = set(tokenize(normalized_resume))
    normalized_keywords = [normalize_text(k) for k in keywords if k]
    matched: list[str] = []
    missing: list[str] = []

    for kw in normalized_keywords:
        if not kw:
            continue
        if kw in resume_tokens:
            matched.append(kw)
            continue
        fuzzy_score = fuzz.partial_ratio(kw, normalized_resume)
        semantic_score = 0.0
        if self.semantic_matcher:
            semantic_score = self.semantic_matcher.similarity(kw, normalized_resume)
        if fuzzy_score >= 80 or semantic_score >= 0.7:
            matched.append(kw)
        else:
            missing.append(kw)

    total = len(normalized_keywords) or 1
    score = len(matched) / total
    return score, matched, missing
````
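The `normalize_text` and `tokenize` helpers used above are imported from elsewhere in the package and are not shown on this page. A plausible stdlib sketch of what they do, assuming lowercase normalization and word-like tokenization (the package's actual behavior, e.g. its punctuation handling, may differ):

```python
import re


def normalize_text(text: str) -> str:
    # Lowercase and collapse runs of whitespace (assumed behavior).
    return re.sub(r"\s+", " ", text.lower()).strip()


def tokenize(text: str) -> list[str]:
    # Split into word-like tokens, keeping characters common in tech
    # skill names such as "c++" and "c#" (assumed behavior).
    return re.findall(r"[a-z0-9+#.]+", text)
```

Note that `analyze` tokenizes the already-normalized resume, so exact matching compares lowercase keyword against lowercase tokens. The `total = len(normalized_keywords) or 1` guard means an empty keyword list yields a score of 0.0 rather than a division-by-zero error.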