Skip to content

This Python script processes images in a specified folder, sends them to the OpenAI API and saves the responses as text files.

License

Notifications You must be signed in to change notification settings

erenokur/openai-mass-image-requests

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 

Repository files navigation

OpenAI Mass Image Requests

This Python script processes images in a specified folder, sends them to the OpenAI API and saves the responses as text files.

Features

  • Image Encoding: Encodes images into base64 format for API requests.
  • API Interaction: Sends images and prompts to the OpenAI API to generate descriptions or answers related to the images.
  • MIME Type Handling: Determines the correct MIME type for various image formats (.jpg, .jpeg, .png, .gif, .bmp).
  • Folder Management: Automatically creates necessary folders (images and images/answers) if they don't exist.
  • Error Handling: Includes basic error handling for file operations and API requests.

Prerequisites

  • Python 3.8 or higher

  • OpenAI Python library (pip install openai)

  • python-dotenv library (pip install python-dotenv)

  • An OpenAI API key

  • A .env file in the root directory of the script containing the following:

    OPENAI_API_KEY=<your_openai_api_key>
    ROLE_PROMPT=<your_system_role_prompt>
    CONTENT_PROMPT=<your_user_content_prompt>
    
    • OPENAI_API_KEY: Your OpenAI API key.
    • ROLE_PROMPT: The role prompt (system prompt) to use for the OpenAI API.
    • CONTENT_PROMPT: The content prompt (user prompt) to use for the OpenAI API.
    • OPENAI_MODEL: The model to use for the OpenAI API requests (optional, defaults to gpt-4o-mini).

Setup

  1. Clone the repository:

    git clone <repository_url>
    cd <repository_name>
  2. Install dependencies:

    pip install -r requirements.txt

    (Assuming you have a requirements.txt file with openai and python-dotenv)

  3. Create a .env file:

    • Create a file named .env in the root directory of your project.
    • Add your OpenAI API key, role prompt, and content prompt to the .env file as described in the "Prerequisites" section.

Usage

  1. Place images in the images folder:

    • Put the images you want to process into the images folder, which will be created automatically in the same directory as the script if it doesn't exist.
  2. Run the script:

    python main.py
  3. Find the responses:

    • The script will process each image in the images folder.
    • For each image (e.g., image1.jpg), a corresponding text file (e.g., image1_answer.txt) will be created in the images/answers folder containing the response from the OpenAI API.

Code Explanation

encode_image(image_path)

  • Takes an image path as input.
  • Opens the image in binary read mode ("rb").
  • Reads the image content.
  • Encodes the image data into a base64 string using base64.b64encode().
  • Decodes the base64 string to UTF-8 for compatibility with JSON.
  • Returns the base64 encoded image string.

image_requests(image_path, image_type)

  • Takes an image path and its extension as input.
  • Defines a dictionary mime_types to map image file extensions to their corresponding MIME types.
  • Calls encode_image() to get the base64 representation of the image.
  • Sends a request to the OpenAI API using client.chat.completions.create().
    • Specifies the model as "gpt-4o-mini".
    • Constructs the message with a system role and a user role.
      • System role includes the role_prompt defined in the .env file.
      • User role includes the content_prompt and the image data.
      • The image data is formatted as an image_url with the appropriate MIME type and the base64 encoded image.
    • Sets max_tokens to 300 to limit the response length.
  • Prints the raw API response.
  • Extracts the content of the response (the description or answer) from response.choices[0].message.content.
  • Returns the extracted content.

is_image(file_path)

  • Takes a file path as input.
  • Uses mimetypes.guess_type() to determine the MIME type of the file based on its extension.
  • Returns True if the MIME type starts with "image", indicating it's an image file; otherwise, returns False.

process_images_files(folder_path)

  • Takes a folder path as input.
  • Iterates through each file in the specified folder using os.listdir().
  • For each file, checks if it's an image using is_image().
  • If it's an image:
    • Extracts the file name and extension using os.path.splitext().
    • Constructs the full path to the image file.
    • Constructs the path for the corresponding answer file in the images/answers folder.
    • Checks if an answer file already exists. If not:
      • Calls image_requests() to get the response from the OpenAI API.
      • Writes the response to the answer file.
      • Prints a message indicating that the image was processed and the answer was saved.

check_folder(folder_path)

  • Takes a folder path as input.
  • Checks if the folder exists using os.path.exists().
  • If the folder doesn't exist, it creates it using os.makedirs().

if __name__ == "__main__":

  • Ensures that the code inside this block is executed only when the script is run directly (not imported as a module).
  • Gets the current working directory using os.getcwd() and sets it as app_path.
  • Constructs the path to the images folder.
  • Calls check_folder() to create the images folder and the images/answers subfolder if they don't exist.
  • Calls process_images_files() to process the images in the images folder.
  • Includes a try...except block to catch any exceptions during the process and print an error message.

Notes

  • The script assumes you are using the gpt-4o-mini model. You can modify the model parameter in image_requests() if you want to use a different model.
  • The script currently has a hardcoded max_tokens value of 300. You might need to adjust this based on your needs and the complexity of the expected responses.
  • Make sure to replace the placeholder values in the .env file with your actual API key and prompts.
  • This is a basic implementation. You can extend it further by adding features like batch processing, more sophisticated error handling, logging, and user interface elements.

Feedback

If you have any feedback about the project, please let me know. I am always looking for ways to improve the user experience.

About

This Python script processes images in a specified folder, sends them to the OpenAI API and saves the responses as text files.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages