The ETL+ Platform for GenAI

Welcome to Unstructured! We're trusted by 82% of the Fortune 1000 and used by over 60,000 organizations globally.

We automatically transform complex, unstructured data into clean, structured data for GenAI applications. Data is routed through dynamic transformation and enrichment pipelines to deliver the highest quality output to your LLM. Continuously. Effortlessly. Automatically.

To get started, check out our open source offerings:

Ready for a more performant and reliable experience? Try Unstructured for free today and experience the next evolution of ETL for GenAI applications.

Learn more:

  • Company Website - Transform complex, unstructured data into clean, structured data. Securely. Continuously. Effortlessly.
  • Extensive Documentation - Our comprehensive docs cover everything from getting started guides to in-depth API references, ensuring you have the resources you need to succeed.
  • Developer Community on Slack - Connect with fellow developers, share knowledge, and get support through our vibrant community Slack channel.

Popular repositories

  1. unstructured Public

    Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website …

    HTML 12.5k 1k

  2. unstructured-api Public

    Python 803 173

  3. unstructured-inference Public

    Python 191 68

  4. pipeline-sec-filings Public archive

    Preprocessing pipeline notebooks and API supporting text extraction from SEC documents

    Jupyter Notebook 147 36

  5. unstructured-python-client Public

    A Python client for the Unstructured Platform API

    Python 106 18

  6. unstructured-ingest Public

    HTML 98 51

Repositories

Showing 10 of 39 repositories
  • base-images Public

    Store Dockerfiles and Packer configs for images to use as a base to build upon

    Shell 4 Apache-2.0 2 1 1 Updated Sep 3, 2025
  • unstructured-js-client Public

    A JavaScript/TypeScript client for the Unstructured Platform API

    TypeScript 57 MIT 17 6 2 Updated Sep 3, 2025
  • unstructured-python-client Public

    A Python client for the Unstructured Platform API

    Python 106 MIT 18 13 3 Updated Sep 3, 2025
  • notebooks Public
    Jupyter Notebook 1 0 0 0 Updated Sep 2, 2025
  • unstructured Public

    Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.

    HTML 12,547 Apache-2.0 1,030 172 (3 issues need help) 52 Updated Sep 3, 2025
  • docs Public

    Documentation for all Unstructured products and libraries

    MDX 7 25 0 7 Updated Sep 2, 2025
  • unstructured-api Public
    Python 803 Apache-2.0 173 31 9 Updated Aug 28, 2025
  • unstructured-inference Public
    Python 191 Apache-2.0 68 23 13 Updated Aug 25, 2025
  • unstructured-ingest Public
    HTML 98 Apache-2.0 51 56 29 Updated Aug 22, 2025
  • .github Public
    0 2 2 1 Updated Aug 20, 2025

People

This organization has no public members. You must be a member to see who’s a part of this organization.