HelpersTask702_Describe_use_and_architecture_of_llm_transform.py_flow #703

Open
wants to merge 10 commits into
base: master
118 changes: 118 additions & 0 deletions docs/tools/all.llm_transform.explanation.md
@@ -0,0 +1,118 @@
<!-- toc -->

- [llm_transform.py - Architecture & Flow Explanation](#llm_transformpy---architecture--flow-explanation)
  * [High Level Flow](#high-level-flow)
  * [Architecture Diagrams (C4)](#architecture-diagrams-c4)
    + [System Context](#system-context)
    + [Container](#container)
    + [Component](#component)

<!-- tocstop -->

# llm_transform.py - Architecture & Flow Explanation

## High Level Flow

- **Argument parsing** – uses [`/helpers/hparser.py`](/helpers/hparser.py) to
  normalise CLI flags.
- **Input acquisition** – [`/helpers/hio.py`](/helpers/hio.py) resolves `-i` or
  stdin and reads bytes.
- **Prompt selection** – `llm_prompts.py` maps the `-p/--prompt-tag` value to a
  concrete system/assistant prompt.
- **LLM invocation** – the request is handed to the generic client in
  [`/helpers/hserver.py`](/helpers/hserver.py) through `llm_prompts.py`.
- **Post-processing** – raw LLM text may be re-formatted by
  [`/helpers/hmarkdown.py`](/helpers/hmarkdown.py) (e.g. bold top-level
  bullets).
- **Output emission** – [`/helpers/hio.py`](/helpers/hio.py) writes to stdout or
  the `-o` file.
- **Optional Dockerisation** – if `--dockerize` is set, control reroutes via
  `dockerized_llm_transform.py`, which uses
  [`/helpers/hdocker.py`](/helpers/hdocker.py) to spin up a container and
  re-invoke the script inside it.
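The non-Docker path above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the `PROMPTS` table, the prompt wording, and the `"-"` defaults are hypothetical and are not the real `hparser`, `hio`, `llm_prompts`, or `hmarkdown` APIs. The LLM call is passed in as a plain callable so the flow can be exercised without network access, and the sketch reads text rather than bytes.

```python
import argparse
import re
import sys
from typing import Callable, Dict, List

# Hypothetical prompt table; the real mapping lives in `llm_prompts.py`.
PROMPTS: Dict[str, str] = {
    "md_rewrite": "Rewrite the following markdown so it is clearer.",
}


def parse_args(argv: List[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Illustrative transform flow")
    # Defaulting to "-" (stdin/stdout) is an assumption of this sketch.
    parser.add_argument("-i", dest="input", default="-")
    parser.add_argument("-o", dest="output", default="-")
    parser.add_argument("-p", "--prompt-tag", dest="prompt_tag", required=True)
    return parser.parse_args(argv)


def read_input(path: str) -> str:
    """Read from stdin when path is "-", otherwise from the file."""
    if path == "-":
        return sys.stdin.read()
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def write_output(path: str, text: str) -> None:
    """Write to stdout when path is "-", otherwise to the file."""
    if path == "-":
        sys.stdout.write(text)
        return
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


def bold_top_level_bullets(text: str) -> str:
    """Bold bullets that start at column 0; nested bullets are untouched."""
    return re.sub(r"^- (?!\*\*)(.+)$", r"- **\1**", text, flags=re.MULTILINE)


def transform(
    argv: List[str],
    call_llm: Callable[[str, str], str],
    read: Callable[[str], str] = read_input,
    write: Callable[[str, str], None] = write_output,
) -> None:
    """Parse flags, read input, pick a prompt, call the LLM, format, write."""
    args = parse_args(argv)
    text = read(args.input)
    system_prompt = PROMPTS[args.prompt_tag]
    raw = call_llm(system_prompt, text)
    write(args.output, bold_top_level_bullets(raw))
```

Injecting `call_llm`, `read`, and `write` keeps each stage replaceable, which mirrors the separation of responsibilities in the list above; the `--dockerize` branch is deliberately left out.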

## Architecture Diagrams (C4)

### System Context

```mermaid
C4Context
title LLM Transform – System Context
Person(dev, "Developer", "Invokes CLI to transform code/text")
System_Boundary(causify, "Causify CLI Tools") {
Container(llm_cli, "llm_transform.py", "Python CLI", "Coordinates LLM transformations")
}
System_Ext(openai, "LLM Provider", "REST API", "e.g. OpenAI")
Rel(dev, llm_cli, "Runs", "CLI")
Rel(llm_cli, openai, "Sends prompt & receives completion", "HTTPS")
```

### Container

Review comments on the `plantuml` fence:

- Contributor Author: Does mermaid support this natively so we can render it on GH? In other words, can we use mermaid instead of plantuml?
- Collaborator: I tried using mermaid, but the diagrams just don't look good. Lines and text overlap each other, which in my opinion would confuse viewers.
- Collaborator: (image) This is how the mermaid diagram looks, BTW. I could alternatively add images directly to the markdown.

```plantuml
@startuml
title LLM Transform – Containers

' Components
component [LLM API] as openai_api
note top of openai_api : HTTPS – External large-language-model

' Databases
database "Local FS" as filesystem
note top of filesystem : Text/Code files\nInput & output artefacts

' Containers
node "llm_transform.py\n(Python – Core orchestration CLI)" as llm_transform
node "dockerized_llm_transform.py\n(Python – Optional container bootstrapper)" as docker_wrapper

' Relationships
llm_transform --> openai_api : Calls
llm_transform --> filesystem : Reads/Writes
docker_wrapper --> llm_transform : Executes inside container
@enduml
```
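The "Executes inside container" relationship can be illustrated by the kind of `docker run` command a wrapper might assemble. This is only a sketch under stated assumptions: the image name, the mount point, and the `OPENAI_API_KEY` variable name are hypothetical, and the real `dockerized_llm_transform.py` and `hdocker.py` may build the command differently.

```python
import os
from typing import List, Sequence


def build_docker_cmd(
    script_args: Sequence[str],
    host_dir: str,
    image: str = "example/llm-transform:latest",  # Hypothetical image name.
    container_dir: str = "/workdir",  # Hypothetical mount point.
) -> List[str]:
    """Build a `docker run` command that re-runs the script in a container."""
    return [
        "docker", "run", "--rm",
        # Bind-mount the host directory so relative -i/-o paths resolve
        # inside the container; Docker needs an absolute host path here.
        "-v", f"{os.path.abspath(host_dir)}:{container_dir}",
        "-w", container_dir,
        # Pass the API key through from the host environment by name only.
        "-e", "OPENAI_API_KEY",
        image,
        "llm_transform.py", *script_args,
    ]
```

Returning the command as a list, rather than a shell string, avoids quoting problems when file names contain spaces.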

### Component

```plantuml
@startuml
title llm_transform.py – Internal Components

' Components
component [OpenAI API] as OpenAI_API
note top of OpenAI_API : REST-based LLM provider.

' Containers
node "llm_transform.py\n(Python CLI)" as llm_transform_container {
[llm_transform.py] as llm_main
note left of llm_main: Main entrypoint / Coordinator

[helpers/hparser.py] as hparser
note left of hparser: Argument parsing

[helpers/hio.py] as hio
note left of hio: File / STDIN I/O

[llm_prompts.py] as llm_prompts
note left of llm_prompts: Prompt templates & dispatch

[helpers/hmarkdown.py] as hmarkdown
note left of hmarkdown: Markdown post-processing

[helpers/hgit.py] as hgit
note left of hgit: Git diff utilities

[helpers/hdocker.py] as hdocker
note left of hdocker: Docker helpers
}

' Edge labels
llm_main --> hparser : Parses flags → supplies prompt-tag
llm_main --> hio : Reads/Writes files or STDIN/STDOUT
llm_main --> llm_prompts : Selects prompt template
llm_prompts --> OpenAI_API : Calls LLM provider
llm_main --> hmarkdown : Formats output as Markdown
llm_main --> hgit : Optionally computes Git diff
llm_main --> hdocker : Spawns container run (when --dockerize)
@enduml
```
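The "Prompt templates & dispatch" role of `llm_prompts.py` could be organised as a tag-to-builder registry. The sketch below shows one plausible shape and is not the module's actual code: `register_prompt`, `get_system_prompt`, `list_tags`, and the prompt wording are hypothetical names and text chosen for illustration.

```python
from typing import Callable, Dict, List

# Maps a prompt tag to a function that returns the system prompt text.
_PROMPT_BUILDERS: Dict[str, Callable[[], str]] = {}


def register_prompt(tag: str) -> Callable[[Callable[[], str]], Callable[[], str]]:
    """Decorator that registers a prompt builder under the given tag."""
    def decorator(func: Callable[[], str]) -> Callable[[], str]:
        _PROMPT_BUILDERS[tag] = func
        return func
    return decorator


@register_prompt("code_fix_from_imports")
def _code_fix_from_imports() -> str:
    # Hypothetical wording; the real prompt text is not shown in this doc.
    return "Replace `from X import Y` statements with `import X.Y`."


@register_prompt("latex_check")
def _latex_check() -> str:
    # Hypothetical wording.
    return "Check the following LaTeX for errors."


def list_tags() -> List[str]:
    """Return all registered tags in sorted order."""
    return sorted(_PROMPT_BUILDERS)


def get_system_prompt(tag: str) -> str:
    """Look up the builder for a tag and return its prompt text."""
    try:
        builder = _PROMPT_BUILDERS[tag]
    except KeyError:
        raise ValueError(
            f"Unknown prompt tag '{tag}'; valid tags: {list_tags()}"
        ) from None
    return builder()
```

A registry keeps the tag list and the prompt definitions in one place, so a listing command and the dispatcher cannot drift apart.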
138 changes: 123 additions & 15 deletions docs/tools/all.llm_transform.how_to_guide.md
@@ -1,15 +1,123 @@
# Available prompt tags:
code_1_unit_test
code_apply_refactoring
code_comment
code_docstring
code_propose_refactoring
code_review
code_review_and_improve
code_type_hints
code_unit_test
md_rewrite
md_summarize_short
slide_colorize
slide_colorize_points
slide_improve
<!-- toc -->

- [`llm_transform.py` - How-to Guide](#llm_transformpy---how-to-guide)
  * [Goal / Use Case](#goal--use-case)
  * [Assumptions / Requirements](#assumptions--requirements)
  * [Step-by-Step Instructions](#step-by-step-instructions)
  * [Examples](#examples)
    + [Using an Input File](#using-an-input-file)
    + [Using `stdin`](#using-stdin)
    + [Using Vim](#using-vim)
  * [Prompts](#prompts)

<!-- tocstop -->

# `llm_transform.py` - How-to Guide

## Goal / Use Case

- This guide explains how to use `llm_transform.py` on code and text files
- This tool focuses on improving code quality, documentation, and formatting
through various transformations (such as code fixes, reviews, and markdown
processing) using LLMs and Python code

## Assumptions / Requirements

- Docker is installed and properly configured on your system.
- An OpenAI API key is available in your environment variables.
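
A quick preflight check for these two requirements might look like the sketch below. It assumes the key is exposed as `OPENAI_API_KEY`, which is the variable the OpenAI Python client reads by default; this guide does not name the variable, so treat that as an assumption. The function name is hypothetical, and the check only confirms that a `docker` executable is on `PATH`, not that the daemon is configured correctly.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def check_requirements(
    env: Optional[Mapping[str, str]] = None,
    which: Optional[Callable[[str], Optional[str]]] = None,
) -> List[str]:
    """Return a list of human-readable problems; empty means all good."""
    env = os.environ if env is None else env
    which = shutil.which if which is None else which
    problems: List[str] = []
    if which("docker") is None:
        problems.append("docker executable not found on PATH")
    # Assumed variable name; adjust if the tool reads a different one.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems
```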

## Step-by-Step Instructions

- Run the transformation command:
```bash
> llm_transform.py -i <input-file> -o <output-file> -p <prompt-tag>
```

## Examples

- You can use an input file or `stdin`

### Using an Input File

- **Input file:** `research_amp/causal_kg/scrape_fred_metadata.py`

- Example content:
```python
from utils import parser
from helpers import hopenai
```

- **Output file:** `research_amp/causal_kg/scrape_fred_metadata_new.py`

- **Command:**

```bash
> llm_transform.py -i research_amp/causal_kg/scrape_fred_metadata.py -o research_amp/causal_kg/scrape_fred_metadata_new.py -p code_fix_from_imports
```

- **Resulting output:**

```python
import utils.parser
import helpers.hopenai
```
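
To make the expected result of `code_fix_from_imports` concrete, here is a deterministic approximation of the rewrite. It is not how the tool works (the tool delegates to an LLM); it is a hypothetical regex-based helper that handles only the simplest `from X import Y` form and leaves everything else unchanged. Note that `import X.Y` is only valid when `Y` is a submodule, and call sites would also need updating from `Y.` to `X.Y.`.

```python
import re

# Matches single-name imports such as `from utils import parser`.
# Multi-name (`from a import b, c`) and aliased imports do not match.
_FROM_IMPORT = re.compile(r"^from\s+([\w.]+)\s+import\s+(\w+)\s*$")


def fix_from_imports(source: str) -> str:
    """Rewrite `from X import Y` lines as `import X.Y`, line by line."""
    fixed_lines = []
    for line in source.splitlines():
        match = _FROM_IMPORT.match(line)
        if match:
            fixed_lines.append(f"import {match.group(1)}.{match.group(2)}")
        else:
            fixed_lines.append(line)
    return "\n".join(fixed_lines)
```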

### Using `stdin`

- **Command:**

```bash
> llm_transform.py -i - -o - -p code_fix_from_imports
# input:
from utils import parser
from helpers import hopenai
# press Ctrl + D
```

- **Resulting output:**

```python
import utils.parser
import helpers.hopenai
```

### Using Vim

- You can transform selected lines directly within **Vim**:
```vim
:'<,'>!llm_transform.py -p summarize -i - -o -
```

- This command pipes the current visual selection (denoted by `'<,'>`) to
`llm_transform.py` with the `summarize` prompt and replaces the selection with
the transformed text.
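
The Vim integration relies only on the filter contract: the selection arrives on stdin, and whatever the command writes to stdout replaces it. A minimal sketch of that contract follows, with a trivial stand-in transformation in place of the LLM call; `run_as_filter` and `first_line_only` are hypothetical names for illustration.

```python
import sys
from typing import Callable, Optional, TextIO


def run_as_filter(
    transform: Callable[[str], str],
    stdin: Optional[TextIO] = None,
    stdout: Optional[TextIO] = None,
) -> None:
    """Read all of stdin, apply the transformation, write it to stdout."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    stdout.write(transform(stdin.read()))


def first_line_only(text: str) -> str:
    """Stand-in for an LLM summary: keep only the first line."""
    lines = text.splitlines()
    return lines[0] + "\n" if lines else ""


if __name__ == "__main__":
    run_as_filter(first_line_only)
```

Any script with this shape can be used with `:'<,'>!` in the same way as the command above.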

## Prompts

- Different transformation types are selected by specifying a `<prompt-tag>`
value
- Available tags include transformations for code fixes, code reviews, markdown
processing, and more, as detailed in the reference documentation.

- You can get the current list with:
```
> llm_transform.py -p list
# Available prompt tags:
code_apply_cfile
code_fix_by_using_f_strings
code_fix_by_using_perc_strings
code_fix_code
code_fix_comments
code_fix_complex_assignments
code_fix_docstrings
code_fix_from_imports
code_fix_function_type_hints
code_fix_log_string
code_fix_logging_statements
code_fix_star_before_optional_parameters
code_fix_unit_test
code_transform_apply_csfy_style
code_transform_apply_linter_instructions
code_transform_remove_redundancy
code_write_1_unit_test
code_write_unit_test
latex_check
latex_rewrite
...
```
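
Because the tags follow a `<family>_<action>` naming pattern, the listing can be grouped by family when scripting around the tool. The helper below is a hypothetical sketch that works on captured listing text in the format shown above; it skips the `#` header line and any `...` elision.

```python
from collections import defaultdict
from typing import Dict, List


def group_tags_by_prefix(listing: str) -> Dict[str, List[str]]:
    """Group prompt tags by the text before their first underscore."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for line in listing.splitlines():
        tag = line.strip()
        # Skip blanks, the "# Available prompt tags:" header, and elisions.
        if not tag or tag.startswith("#") or tag == "...":
            continue
        prefix = tag.split("_", 1)[0]
        groups[prefix].append(tag)
    return dict(groups)
```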