best algorithm to use for image fitting? #835

Open
@chr4ss12

Hi,

I am trying to build an AI that takes the following inputs:

container width, container height, image width, image height, and the coordinates of a human face in that image. It should output the best way to place the image in the container, given that scaling and translation are allowed, optimising for how well the image is displayed while keeping the human face visible.

Essentially, this is for showing large images on smaller surfaces without relying on classical resizing algorithms, while keeping the human face in view.

sample of training data:

    {
        "input": {
            "imageWidth": 476,
            "imageHeight": 847,
            "faceX": 24,
            "faceY": 99,
            "faceWidth": 290,
            "faceHeight": 311,
            "containerWidth": 149,
            "containerHeight": 149
        },
        "output": {
            "scale": 0.319,
            "positionX": -1.64,
            "positionY": -1.61
        }
    }

What is the best way to train this model? I can provide around 100 training input/output pairs manually, but I suspect that is nowhere near enough for it to start predicting anything meaningful, given how many inputs and outputs there are. What do you think?

How can I specify that the image should never be scaled or positioned so that "empty space" is visible? In other words, how can I hard-code rules like these into the network's training:

  1. neither positionX nor positionY can be greater than 0, as a positive value indicates dead space on the left or top.
  2. more rules like that to eliminate the dead space.
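
One way to make rules like these concrete is to treat them as a check and a correction applied to whatever the model predicts, rather than something the network has to learn. The sketch below is only an illustration under an assumption that is not stated in the post: positionX and positionY are taken to be the offset of the scaled image's top-left corner from the container's top-left corner, in container pixels (the sample above is consistent with that reading). The names `Placement`, `has_dead_space` and `clamp_placement` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    scale: float
    position_x: float  # assumed: offset of the scaled image's left edge, in container pixels
    position_y: float  # assumed: offset of the scaled image's top edge, in container pixels


def has_dead_space(image_w, image_h, container_w, container_h, p, eps=1e-9):
    """Return True if the scaled, translated image fails to cover the container."""
    scaled_w = image_w * p.scale
    scaled_h = image_h * p.scale
    return (
        p.position_x > eps                              # gap on the left
        or p.position_y > eps                           # gap on the top
        or p.position_x + scaled_w < container_w - eps  # gap on the right
        or p.position_y + scaled_h < container_h - eps  # gap on the bottom
    )


def clamp_placement(image_w, image_h, container_w, container_h, p):
    """Move a predicted placement to the nearest one that leaves no dead space."""
    # Smallest scale at which the image still covers the container on both axes.
    min_scale = max(container_w / image_w, container_h / image_h)
    scale = max(p.scale, min_scale)
    # Valid offsets lie between (container size - scaled size) and 0.
    min_x = container_w - image_w * scale
    min_y = container_h - image_h * scale
    x = min(0.0, max(min_x, p.position_x))
    y = min(0.0, max(min_y, p.position_y))
    return Placement(scale, x, y)


sample = Placement(scale=0.319, position_x=-1.64, position_y=-1.61)
print(has_dead_space(476, 847, 149, 149, sample))
print(clamp_placement(476, 847, 149, 149, Placement(0.2, 5.0, 5.0)))
```

With a correction step like this, the model's raw output never has to satisfy the rules exactly; whether clamping afterwards or penalising violations during training works better here is an open question.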

I could try to "normalize" some of the inputs:

I could fold containerWidth and containerHeight into a single aspectRatio (containerRatio below), assuming that containerWidth=100 when aspectRatio=1, which would eliminate one variable.

If I did the same with imageWidth and imageHeight, folding them into an imageRatio, that would eliminate another, eventually giving:

    "input": {
        "imageRatio": ....,
        "containerRatio": ...,
        "faceX": 24,
        "faceY": 99,
        "faceWidth": 290,
        "faceHeight": 311,
    },
    "output": {
        "scale": 0.319,
        "positionX": -1.64,
        "positionY": -1.61
    },
    

But that is still a lot of inputs, imo.
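
As a rough sketch of that folding step, the function below turns one raw training input into the reduced form. The two ratios follow the description above. Dividing the face box by the image dimensions is an extra assumption that the post does not make (it keeps the raw pixel values); it is included only because pixel coordinates lose their meaning once the image size is replaced by a ratio. The name `normalize_input` is hypothetical.

```python
def normalize_input(sample: dict) -> dict:
    """Fold absolute sizes into ratios, as described in the post."""
    image_w = sample["imageWidth"]
    image_h = sample["imageHeight"]
    return {
        "imageRatio": image_w / image_h,
        "containerRatio": sample["containerWidth"] / sample["containerHeight"],
        # Assumption (not in the post): express the face box as fractions
        # of the image size so that it stays meaningful without the raw sizes.
        "faceX": sample["faceX"] / image_w,
        "faceY": sample["faceY"] / image_h,
        "faceWidth": sample["faceWidth"] / image_w,
        "faceHeight": sample["faceHeight"] / image_h,
    }


raw = {
    "imageWidth": 476, "imageHeight": 847,
    "faceX": 24, "faceY": 99, "faceWidth": 290, "faceHeight": 311,
    "containerWidth": 149, "containerHeight": 149,
}
print(normalize_input(raw))
```

If the inputs are rescaled like this, the outputs (scale, positionX, positionY) would need to be expressed in matching units before training.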
