image_classification.clip_model
clip_model.py
Defines CLIPModelManager, which loads an OpenCLIP model and its preprocessing pipeline and performs inference tasks such as image encoding, text encoding, classification, and similarity computation.
CLIPModelManager Objects
class CLIPModelManager()
Manage an OpenCLIP model and ImageNet class data for image classification.
Attributes:
model
- Loaded OpenCLIP model.
preprocess
- Preprocessing pipeline for image inputs.
tokenizer
- Tokenizer for text inputs.
imagenet_labels
- Mapping from class index to label info.
text_descriptions
- List of text prompts for classification.
model_name
- Name of the OpenCLIP architecture.
pretrained
- Identifier for pretrained weights.
imagenet_classes_path
- Optional path to custom ImageNet JSON.
load_time
- Time taken to load the model.
is_loaded
- Whether model and classes have been initialized.
device
- Torch device used for inference.
__init__
def __init__(model_name: str = "ViT-B-32",
pretrained: str = "laion2b_s34b_b79k",
model_path: Optional[str] = None,
imagenet_classes_path: Optional[str] = None) -> None
Initialize CLIPModelManager with configuration parameters.
Arguments:
model_name
- Architecture name to load.
pretrained
- Pretrained weight identifier.
model_path
- Optional local path to model checkpoint.
imagenet_classes_path
- Optional path to ImageNet class JSON.
initialize
def initialize() -> None
Load the OpenCLIP model, transforms, tokenizer, and ImageNet classes.
Raises:
RuntimeError
- If model or classes fail to load.
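A minimal initialization sketch, assuming clip_model.py is importable via the module path shown above and that the default weights can be resolved by OpenCLIP:

```python
from image_classification.clip_model import CLIPModelManager

# Construct with the documented defaults; both path arguments are optional.
manager = CLIPModelManager(
    model_name="ViT-B-32",
    pretrained="laion2b_s34b_b79k",
)
manager.initialize()  # raises RuntimeError if the model or class data fail to load
print(manager.is_loaded, manager.device, manager.load_time)
```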
encode_image
def encode_image(image_bytes: bytes) -> np.ndarray
Encode raw image bytes into a unit-normalized feature vector.
Arguments:
image_bytes
- Raw image data in bytes.
Returns:
1D numpy array of feature values.
Raises:
RuntimeError
- If model is not initialized.
encode_text
def encode_text(text: str) -> np.ndarray
Encode text into a unit-normalized feature vector.
Arguments:
text
- Input text string.
Returns:
1D numpy array of feature values.
Raises:
RuntimeError
- If model is not initialized.
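Because both encoders return unit-normalized vectors, the dot product of an image vector and a text vector is their cosine similarity. A sketch continuing from the initialization example above, assuming a local file cat.jpg exists:

```python
import numpy as np

with open("cat.jpg", "rb") as f:
    image_vec = manager.encode_image(f.read())  # 1D unit-normalized array
text_vec = manager.encode_text("a photo of a cat")

# Dot product of two unit vectors equals their cosine similarity.
print(float(np.dot(image_vec, text_vec)))
```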
classify_image_with_labels
def classify_image_with_labels(image_bytes: bytes,
target_labels: Optional[List[str]] = None,
top_k: int = 3) -> List[Tuple[str, float]]
Classify an image against specified or default labels.
Arguments:
image_bytes
- Raw image data in bytes.
target_labels
- Optional list of label names to restrict classification.
top_k
- Number of top predictions to return.
Returns:
A list of (label, score) tuples for the top_k predictions.
Raises:
RuntimeError
- If model is not initialized.
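A classification sketch continuing from the examples above; omitting target_labels falls back to the default ImageNet prompts:

```python
with open("cat.jpg", "rb") as f:
    predictions = manager.classify_image_with_labels(
        f.read(),
        target_labels=["cat", "dog", "bird"],  # optional; defaults to ImageNet labels
        top_k=2,
    )
for label, score in predictions:
    print(f"{label}: {score:.4f}")
```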
compute_similarity
def compute_similarity(image_bytes: bytes, text: str) -> float
Compute cosine similarity between an image and a text string.
Arguments:
image_bytes
- Raw image data in bytes.
text
- Text input for comparison.
Returns:
Cosine similarity score.
Raises:
RuntimeError
- If model is not initialized.
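The same image-text pairing shown in the encoding sketch above can be done in a single call:

```python
with open("cat.jpg", "rb") as f:
    score = manager.compute_similarity(f.read(), "a photo of a cat")
print(f"cosine similarity: {score:.4f}")
```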
get_model_info
def get_model_info() -> Dict[str, Any]
Retrieve model configuration and status information.
Returns:
Dictionary containing model_name, pretrained, device, load_time, and the number of ImageNet classes loaded.
image_classification.clip_service
clip_service.py
Provides the gRPC service implementation for OpenCLIP operations, delegating requests to a CLIPModelManager for image encoding, text embedding, classification, similarity computation, health checks, and model management.
CLIPService Objects
class CLIPService()
gRPC service for OpenCLIP model operations.
Wraps a CLIPModelManager instance and exposes methods for:
- Initializing the model
- Image processing
- Text embedding
- Similarity computation
- Health checks
- Switching models at runtime
- Retrieving service and model metadata
__init__
def __init__(model_name: str = "ViT-B-32",
pretrained: str = "laion2b_s34b_b79k",
model_path: Optional[str] = None,
imagenet_classes_path: Optional[str] = None) -> None
Initialize the CLIPService.
Arguments:
model_name
- Name of the OpenCLIP architecture to load.
pretrained
- Identifier for pretrained weights.
model_path
- Optional path to a custom model checkpoint.
imagenet_classes_path
- Optional path to an ImageNet class index JSON.
initialize
def initialize() -> None
Load and initialize the OpenCLIP model and class data.
Raises:
RuntimeError
- If model initialization fails.
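A sketch of standing the service up before wiring it into a gRPC server; registering it with a grpc.server is done through the generated ml_service_pb2_grpc stubs, which are not covered by this reference:

```python
from image_classification.clip_service import CLIPService

service = CLIPService(model_name="ViT-B-32", pretrained="laion2b_s34b_b79k")
service.initialize()  # raises RuntimeError if model initialization fails
```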
process_image_for_clip
def process_image_for_clip(request: ml_service_pb2.ImageProcessRequest,
context) -> ml_service_pb2.ImageProcessResponse
Handle an ImageProcessRequest via CLIPModelManager.
Arguments:
request
- Protobuf request containing image bytes, image_id, and optional target_labels and top_k.
context
- gRPC context for setting status codes and details.
Returns:
An ImageProcessResponse with the feature vector, label scores, model_version, and processing_time_ms set. Status codes are set on the context for invalid input or internal errors.
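Handlers are normally invoked by the gRPC runtime, but they can be exercised in-process, e.g. in a unit test. The request field names below (image_data, image_id, target_labels, top_k) are assumptions inferred from the description above; check the service's .proto definition:

```python
from unittest import mock

import ml_service_pb2  # generated protobuf module; the import path may differ

with open("cat.jpg", "rb") as f:
    request = ml_service_pb2.ImageProcessRequest(
        image_data=f.read(),           # field names assumed, not confirmed
        image_id="cat-001",
        target_labels=["cat", "dog"],
        top_k=2,
    )
# The gRPC runtime supplies the context in production; a mock suffices here.
response = service.process_image_for_clip(request, mock.MagicMock())
print(response.model_version, response.processing_time_ms)
```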
get_text_embedding_for_clip
def get_text_embedding_for_clip(
request: ml_service_pb2.TextEmbeddingRequest,
context) -> ml_service_pb2.TextEmbeddingResponse
Handle a TextEmbeddingRequest via CLIPModelManager.
Arguments:
request
- Protobuf request containing a non-empty text field.
context
- gRPC context for status codes.
Returns:
A TextEmbeddingResponse with text_feature_vector, model_version, and processing_time_ms set. Status codes are set on the context for invalid input.
compute_similarity_for_clip
def compute_similarity_for_clip(image_bytes: bytes, text: str) -> float
Compute cosine similarity between image and text embeddings.
Arguments:
image_bytes
- Raw image bytes.
text
- Input text string.
Returns:
Cosine similarity score as a float.
Raises:
RuntimeError
- If the model is not initialized.
health_check
def health_check(
service_name: str = "openclip") -> ml_service_pb2.HealthCheckResponse
Perform a health check of the CLIPService.
Arguments:
service_name
- Identifier returned in the response.
Returns:
A HealthCheckResponse with status, model_name, model_version, uptime_seconds, and a descriptive message.
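A health-check sketch continuing from the service example above; the response fields listed under Returns can be read directly:

```python
health = service.health_check(service_name="openclip")
print(health.status, health.model_name, health.model_version,
      health.uptime_seconds, health.message)
```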
get_model_info
def get_model_info() -> Dict[str, Any]
Retrieve metadata about the current model and service.
Returns:
A dictionary containing model_name, pretrained weights, device, is_loaded, load_time, and service uptime.
switch_model
def switch_model(new_model_name: str,
new_pretrained: str,
model_path: Optional[str] = None) -> bool
Replace the current CLIP model with a new one at runtime.
Arguments:
new_model_name
- Name of the new model architecture.
new_pretrained
- Identifier for the new pretrained weights.
model_path
- Optional checkpoint path for the new model.
Returns:
True if the switch succeeds; raises on failure.
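A runtime-switch sketch; ViT-L-14 with laion2b_s32b_b82k is one commonly available OpenCLIP pairing, assuming it is present in (or downloadable by) your open_clip installation:

```python
# Returns True on success; raises if the new model cannot be loaded.
ok = service.switch_model(
    new_model_name="ViT-L-14",
    new_pretrained="laion2b_s32b_b82k",
)
assert ok
```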
get_performance_stats
def get_performance_stats() -> Dict[str, Any]
Gather runtime performance statistics for the service.
Returns:
A dictionary with model, device, uptime, load_time, and health status.