feat: photoMaker in imageInference, IIpAdapter (guideImages, etc.), IEmbedding weight, dict coercion for nested options #251

Open
Sirsho1997 wants to merge 6 commits into main from fix-schemaChanges

Collaborator

Sirsho1997 commented Mar 5, 2026

Added

  • Added photoMaker as a nested object in imageInference, typed with IPhotoMaker: IImageInference.photoMaker: Optional[Union[IPhotoMaker, Dict[str, Any]]]. For nested use, model, positivePrompt, height, and width are optional on IPhotoMaker. The class carries both images and inputImages: the standalone task uses inputImages, imageInference uses images. The existing taskType photoMaker is unchanged; the standalone path still validates the required model, positivePrompt, height, and width.
  • Added photoMaker payload building to the imageInference flow (photo_maker_data) and to _addImageSpecialFields; the existing taskType photoMaker is unchanged.
  • Added guideImages, combineMethod, weightType, embedScaling, and weightComposition to IIpAdapter; guideImage is now optional, and base.py processes guideImages with process_image.
  • Added IEmbedding.weight: Optional[float].
  • Added dict coercion in IImageInference.__post_init__ for photoMaker, instantID, acePlusPlus, puLID, ultralytics, outpaint, and refiner; added list coercion for embeddings (dict → IEmbedding) and ipAdapters (dict → IIpAdapter).
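
The dict and list coercion described above can be sketched roughly as follows. This is a minimal illustration, not the code in runware/types.py: EmbeddingSketch, PhotoMakerSketch, and ImageInferenceSketch are hypothetical stand-ins with only a few fields, and the model identifiers are made up.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Union


@dataclass
class EmbeddingSketch:
    # Hypothetical stand-in for IEmbedding.
    model: str
    weight: Optional[float] = None


@dataclass
class PhotoMakerSketch:
    # Hypothetical stand-in for IPhotoMaker used as a nested option.
    style: Optional[str] = None
    images: Optional[List[str]] = None


@dataclass
class ImageInferenceSketch:
    # Hypothetical stand-in for IImageInference: nested options accept
    # either the dataclass or a plain dict.
    positivePrompt: str
    photoMaker: Optional[Union[PhotoMakerSketch, Dict[str, Any]]] = None
    embeddings: List[Union[EmbeddingSketch, Dict[str, Any]]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Single nested option: dict -> dataclass.
        if isinstance(self.photoMaker, dict):
            self.photoMaker = PhotoMakerSketch(**self.photoMaker)
        # List option: coerce only dict items, keep dataclass items as they are.
        self.embeddings = [
            EmbeddingSketch(**item) if isinstance(item, dict) else item
            for item in self.embeddings
        ]


request = ImageInferenceSketch(
    positivePrompt="a portrait",
    photoMaker={"style": "No style", "images": ["image-uuid-1"]},
    embeddings=[
        {"model": "example:1@1", "weight": 0.8},
        EmbeddingSketch(model="example:2@1"),
    ],
)
```

After construction, request.photoMaker and every item of request.embeddings are dataclass instances, so downstream payload building only has to handle one shape. An unknown key in a dict raises TypeError from the dataclass constructor, which surfaces typos early.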

Changed

  • IPhotoMaker: model, positivePrompt, height, and width are now Optional so the class can serve as the nested photoMaker in imageInference; added images: Optional[List[Union[str, File]]]. Removed the __post_init__ validation (max 4 images, style).
  • IImageInference: refiner, outpaint, instantID, acePlusPlus, puLID, ultralytics, photoMaker typed as Union[Type, Dict[str, Any]]; embeddings as List[Union[IEmbedding, Dict[str, Any]]]; ipAdapters as List[Union[IIpAdapter, Dict[str, Any]]].
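
Because the fields are now Optional on the dataclass, the required-field check for the standalone task has to happen outside the type itself. A minimal sketch of that idea, assuming a hypothetical validate_standalone helper and a cut-down PhotoMakerSketch class (neither name comes from the SDK):

```python
from dataclasses import dataclass
from typing import List, Optional

# Fields the standalone photoMaker task requires, per the PR description.
_REQUIRED_STANDALONE = ("model", "positivePrompt", "height", "width")


@dataclass
class PhotoMakerSketch:
    # Hypothetical stand-in for IPhotoMaker with every field optional.
    model: Optional[str] = None
    positivePrompt: Optional[str] = None
    height: Optional[int] = None
    width: Optional[int] = None
    images: Optional[List[str]] = None       # used when nested in imageInference
    inputImages: Optional[List[str]] = None  # used by the standalone task


def validate_standalone(request: PhotoMakerSketch) -> None:
    """Raise ValueError if a field required by the standalone task is unset."""
    missing = [name for name in _REQUIRED_STANDALONE if getattr(request, name) is None]
    if missing:
        raise ValueError(f"photoMaker task requires: {', '.join(missing)}")
```

With this split, PhotoMakerSketch(images=[...]) is valid as a nested option, while the standalone entry point calls validate_standalone before building its payload.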

…Embedding weight, dict coercion for nested options
Contributor

Copilot AI left a comment

Pull request overview

This PR extends the image inference request types and request-building flow to support PhotoMaker settings as a nested object, expands IP-Adapter options (including multi-image guidance), adds IEmbedding.weight, and adds dict/list coercion for several nested options inside IImageInference.

Changes:

  • Added IPhotoMakerSettings and IImageInference.photoMaker, plus payload building and request wiring in the image-inference flow.
  • Expanded IIpAdapter fields (including guideImages) and added dict/list coercion for ipAdapters and embeddings items.
  • Added IEmbedding.weight and consolidated PhotoMaker style validation into a shared _PHOTO_MAKER_VALID_STYLES constant.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File | Description
runware/types.py | Adds new/expanded request dataclasses and IImageInference.__post_init__ coercion to accept nested dicts/lists.
runware/base.py | Updates image-inference request building to process guideImages and include photoMaker payload in the request.

Sirsho1997 requested a review from Copilot on March 5, 2026, 20:17
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.


Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

runware/base.py:668

  • expected_results is set from requestPhotoMaker.numberResults, but IPhotoMaker.numberResults is now optional and may be None. That will break _handle_pending_operation_message: len(results) >= op["expected"] will raise TypeError. Default numberResults to an int (e.g., numberResults or 1) before registering the pending op, and ideally before putting it in the request payload.
        future, should_send = await self._register_pending_operation(
            task_uuid,
            expected_results=numberOfResults,
            complete_predicate=None,
            result_filter=lambda r: r.get("imageUUID") is not None
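
The failure mode and the suggested default can be reproduced in isolation. The sketch below is only an illustration: resolve_expected_results is a hypothetical helper, not an SDK function, and it uses an explicit None check rather than `or 1` so that only an unset value is replaced.

```python
from typing import Optional


def resolve_expected_results(number_results: Optional[int]) -> int:
    """Return a usable expected-results count, falling back to 1 when unset."""
    return 1 if number_results is None else number_results


results = [{"imageUUID": "a"}]

# Comparing an int with None is what breaks the completion check.
unset = None
try:
    len(results) >= unset
    comparison_failed = False
except TypeError:
    comparison_failed = True

# With the default applied before registering the pending operation,
# the same comparison is well defined.
expected = resolve_expected_results(None)
is_complete = len(results) >= expected
```

Resolving the value once and reusing it for both the request payload and the pending-operation registration keeps the two in agreement.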

Comment on lines +746 to 750
    if "inputImages" in instant_id_data:
        instant_id_data["inputImages"] = await process_image(instant_id_data["inputImages"])
    else:
        instant_id_data["inputImages"] = await process_image([instant_id_data["inputImage"]])

Copilot AI Mar 6, 2026

instant_id_data["inputImage"] can still be present (and unprocessed) while you also set inputImages. Additionally, the fallback path assumes inputImage exists and will raise KeyError if neither inputImages nor inputImage was provided. Consider normalizing to a single field before sending (e.g., require at least one of inputImages/inputImage, build inputImages, process it, and remove inputImage from the payload dict).

Suggested change
-     if "inputImages" in instant_id_data:
-         instant_id_data["inputImages"] = await process_image(instant_id_data["inputImages"])
-     else:
-         instant_id_data["inputImages"] = await process_image([instant_id_data["inputImage"]])
+     # Normalize instant_id_data images to a single `inputImages` field
+     input_images = instant_id_data.get("inputImages")
+     single_input_image = instant_id_data.get("inputImage")
+     if input_images is None and single_input_image is not None:
+         # Build inputImages list from a single inputImage
+         input_images = [single_input_image]
+     if input_images is not None:
+         instant_id_data["inputImages"] = await process_image(input_images)
+     # Remove the legacy single-image field to avoid unprocessed duplicates
+     instant_id_data.pop("inputImage", None)
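
The comment also asks that at least one of inputImages/inputImage be required. A stricter variant of the normalization could look like the sketch below. It is an assumption-laden illustration: process_image_stub stands in for the SDK's process_image, and normalize_input_images is a hypothetical helper name.

```python
import asyncio
from typing import Any, Dict, List


async def process_image_stub(images: List[str]) -> List[str]:
    # Stand-in for the SDK's process_image; it only tags each entry.
    return [f"processed:{image}" for image in images]


async def normalize_input_images(data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of data with a single, processed inputImages field."""
    normalized = dict(data)  # shallow copy so the caller's dict is not mutated
    single = normalized.pop("inputImage", None)  # drop the legacy field
    images = normalized.get("inputImages")
    if images is None and single is not None:
        images = [single]
    if not images:
        # Fail with a clear message instead of a KeyError further down.
        raise ValueError("instantID requires inputImages or inputImage")
    normalized["inputImages"] = await process_image_stub(images)
    return normalized


payload = asyncio.run(normalize_input_images({"inputImage": "face.png"}))
```

Whether a missing image should raise or be passed through silently is a design choice for the maintainers; raising makes the misuse visible at the call site.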

  response = results[0]
  self._handle_error_response(response)
- return createImageToTextFromResponse(response)
+ return instantiateDataclass(IImageToText, response)

Copilot AI Mar 6, 2026

Switching to instantiateDataclass(IImageToText, response) drops the previous response coercions (e.g., taskType string -> ETaskType and cost -> float). Since instantiateDataclass currently doesn’t handle Enum or numeric coercion, callers will now get raw strings for these fields. Either restore a dedicated response parser for IImageToText or extend instantiateDataclass to coerce Enum/numeric types consistently (similar to createEnhancedPromptsFromResponse).

Suggested change
- return instantiateDataclass(IImageToText, response)
+ # Coerce response fields to expected types before dataclass instantiation.
+ # This restores previous behaviour where taskType was an ETaskType
+ # and cost was a float, rather than raw strings.
+ if isinstance(response, dict):
+     response_dict = dict(response)
+ else:
+     try:
+         # Attempt a shallow conversion for mapping-like objects
+         response_dict = dict(response)
+     except TypeError:
+         # Fallback: leave as-is if it cannot be converted
+         response_dict = response
+ if isinstance(response_dict, dict):
+     task_type_value = response_dict.get("taskType")
+     if isinstance(task_type_value, str):
+         try:
+             response_dict["taskType"] = ETaskType(task_type_value)
+         except ValueError:
+             # If the value is not a valid ETaskType, leave it unchanged
+             pass
+     cost_value = response_dict.get("cost")
+     if isinstance(cost_value, str):
+         try:
+             response_dict["cost"] = float(cost_value)
+         except (ValueError, TypeError):
+             # If cost is not a valid float, leave it unchanged
+             pass
+ return instantiateDataclass(IImageToText, response_dict)
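
The other option the comment names, coercing Enum and numeric fields generically from the dataclass annotations, might look roughly like this. It is a sketch under assumptions: coerce_into_dataclass, TaskTypeSketch, and ImageToTextSketch are hypothetical names, and the real instantiateDataclass may handle unknown keys and nested types differently.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import Any, Dict, Optional, Type, TypeVar, Union, get_args, get_origin, get_type_hints

T = TypeVar("T")


class TaskTypeSketch(Enum):
    # Hypothetical stand-in for ETaskType.
    IMAGE_CAPTION = "imageCaption"


@dataclass
class ImageToTextSketch:
    # Hypothetical stand-in for IImageToText.
    taskType: TaskTypeSketch
    taskUUID: str
    text: Optional[str] = None
    cost: Optional[float] = None


def _unwrap_optional(annotation: Any) -> Any:
    # Optional[X] is Union[X, None]; return X when there is one non-None member.
    if get_origin(annotation) is Union:
        members = [arg for arg in get_args(annotation) if arg is not type(None)]
        if len(members) == 1:
            return members[0]
    return annotation


def coerce_into_dataclass(cls: Type[T], data: Dict[str, Any]) -> T:
    """Build cls from data, coercing Enum and float fields; unknown keys are ignored."""
    hints = get_type_hints(cls)
    kwargs: Dict[str, Any] = {}
    for f in fields(cls):
        if f.name not in data:
            continue
        value = data[f.name]
        target = _unwrap_optional(hints[f.name])
        try:
            if isinstance(target, type) and issubclass(target, Enum) and not isinstance(value, Enum):
                value = target(value)
            elif target is float and isinstance(value, (int, str)) and not isinstance(value, bool):
                value = float(value)
        except ValueError:
            pass  # leave the raw value unchanged if it cannot be coerced
        kwargs[f.name] = value
    return cls(**kwargs)


caption = coerce_into_dataclass(
    ImageToTextSketch,
    {"taskType": "imageCaption", "taskUUID": "abc", "cost": "0.0013", "extra": "ignored"},
)
```

Driving the coercion from annotations means every response dataclass gets the same treatment without a dedicated parser per type, at the cost of a slightly more involved helper.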

Contributor

teith left a comment

Also need to fix the problems Copilot found.


photo_maker_data = {}
if requestImage.photoMaker:
    photo_maker_data = {}

Contributor

??

Collaborator Author

lol. Will remove that double initialization

  inputImages: Optional[List[Union[File, str]]] = None  # Primary: array of images (UUIDs, URLs, base64, dataURI)
  inputImage: Optional[Union[File, str]] = None  # Convenience: single image, defaults to inputImages[0] if not provided
- prompt: List[str] = field(default_factory=lambda: ["Describe this image in detail"])  # Array of prompts with default
+ prompt: Optional[List[str]] = None

Contributor

why? why is it optional?

Collaborator Author

Age detection (subset of imageCaption) does not need a prompt
