ML Configuration
MLConfig
dataclass
MLConfig(
embedding_model="all-MiniLM-L6-v2",
spacy_model="en_core_web_sm",
similarity_threshold=0.7,
use_gpu=True,
cache_embeddings=True,
batch_size=32,
)
Configuration for machine learning models and parameters.
Controls the behavior of ML components in ProScorer, including embedding models, text analysis, and similarity thresholds.
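As a rough illustration of how the defaults and overrides fit together, the sketch below defines a stand-in dataclass that mirrors the signature shown above. The import path for the real `MLConfig` is not given on this page, so `MLConfigSketch` is a hypothetical name used only for this example. With ProScorer installed you would import `MLConfig` instead.

```python
from dataclasses import dataclass, replace


@dataclass
class MLConfigSketch:
    """Hypothetical stand-in mirroring the documented MLConfig fields and defaults."""

    embedding_model: str = "all-MiniLM-L6-v2"
    spacy_model: str = "en_core_web_sm"
    similarity_threshold: float = 0.7
    use_gpu: bool = True
    cache_embeddings: bool = True
    batch_size: int = 32


# Construct with the documented defaults.
default_config = MLConfigSketch()

# Override only the fields that differ, e.g. a CPU-only, lower-memory setup.
cpu_config = replace(default_config, use_gpu=False, batch_size=8)
```

`dataclasses.replace` returns a new instance and leaves the original untouched, which is convenient when several configurations share most of their settings.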
Attributes:
| Name | Type | Description |
|---|---|---|
| `embedding_model` | `str` | Name of the sentence transformer model to use for embeddings. Default: "all-MiniLM-L6-v2" (fast, good quality). Other options: "all-mpnet-base-v2" (better quality, slower), "all-MiniLM-L12-v2" (balanced). |
| `spacy_model` | `str` | Name of the spaCy language model for text analysis. Default: "en_core_web_sm". Must be installed separately. |
| `similarity_threshold` | `float` | Minimum cosine similarity score (0-1) to consider two texts semantically similar. Default: 0.7. |
| `use_gpu` | `bool` | Whether to use GPU acceleration if available. Default: True. |
| `cache_embeddings` | `bool` | Whether to cache computed embeddings for reuse. Improves performance when scoring multiple resumes. Default: True. |
| `batch_size` | `int` | Batch size for processing embeddings. Default: 32. Increase for better GPU utilization, decrease for lower memory usage. |
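
To make `similarity_threshold` and `batch_size` more concrete, here is a minimal pure-Python sketch of how such settings are typically applied. It is not ProScorer's implementation: the helper names `cosine_similarity`, `is_similar`, and `batched` are hypothetical, and treating a score exactly equal to the threshold as similar is an assumption based on the word "minimum" in the description.

```python
import math
from typing import Iterator, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; report no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


def is_similar(a: Sequence[float], b: Sequence[float], threshold: float = 0.7) -> bool:
    """Assumed rule: similar when the score meets or exceeds the threshold."""
    return cosine_similarity(a, b) >= threshold


def batched(items: List[str], batch_size: int = 32) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Toy 2-D "embeddings" standing in for real model output.
same_direction = is_similar([1.0, 0.0], [2.0, 0.0])
orthogonal = is_similar([1.0, 0.0], [0.0, 1.0])

# 70 texts with the default batch size are split into three batches.
batch_lengths = [len(batch) for batch in batched([f"text {i}" for i in range(70)])]
```

Raising the threshold makes matching stricter, while a larger batch size means fewer, bigger calls to the embedding model at the cost of more memory per call.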