Add relevant 26.02 docs to r1.1.0 #1493

Merged: sarahyurick merged 9 commits into r1.1.0 from sarahyurick/docs on Feb 12, 2026
Conversation

@sarahyurick sarahyurick commented Feb 11, 2026

Notes:

  1. We do not include "[benchmark] Add Video Benchmarks" (#1430) in r1.1.0:
  • r1.1.0 uses output-clip-path
  • main uses output-path
  2. We do not include "Benchmarking script for image pipeline" (#1441) in r1.1.0:
  • r1.1.0 uses batch_size
  • main uses dali_batch_size
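
The branch differences in the notes above can be summarized as a small lookup. This is only an illustrative sketch: the dictionary layout, the keys `video_output` and `image_batch`, and both helper functions are hypothetical and not part of the repository. Only the four option names themselves come from the notes.

```python
# Hypothetical lookup of the option names that differ between branches.
# Only the four option names are taken from the PR notes; everything else
# (keys, helper names) is made up for illustration.
BRANCH_OPTION_NAMES = {
    "r1.1.0": {"video_output": "output-clip-path", "image_batch": "batch_size"},
    "main": {"video_output": "output-path", "image_batch": "dali_batch_size"},
}


def option_name(branch: str, option: str) -> str:
    """Return the option name used on the given branch."""
    try:
        return BRANCH_OPTION_NAMES[branch][option]
    except KeyError as exc:
        raise ValueError(f"unknown branch/option: {branch!r}/{option!r}") from exc


def differing_options() -> list[str]:
    """List the options whose names differ between r1.1.0 and main."""
    old, new = BRANCH_OPTION_NAMES["r1.1.0"], BRANCH_OPTION_NAMES["main"]
    return sorted(key for key in old if old[key] != new[key])
```

A lookup like this makes it explicit why docs written against main cannot be copied to r1.1.0 unchanged: both options resolve to different names on each branch.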

Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>

copy-pr-bot bot commented Feb 11, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Signed-off-by: Sarah Yurick <sarahyurick@gmail.com>
@sarahyurick sarahyurick marked this pull request as ready for review February 12, 2026 00:08
greptile-apps bot commented Feb 12, 2026

Greptile Overview

Greptile Summary

Backports 26.02 release documentation from main to r1.1.0 branch. Updates version identifiers (25.09 → 26.02) across configuration files and adds comprehensive new documentation for synthetic data generation (Nemotron-CC pipelines, LLM client configuration, multilingual Q&A). Enhances existing documentation with improved examples, clarifications, and structural reorganization.

Major additions:

  • New synthetic data generation guides (docs/curate-text/synthetic/*) covering LLM clients, multilingual Q&A, and advanced Nemotron-CC pipelines
  • Enhanced video deduplication documentation with SemanticDeduplicationWorkflow examples
  • Expanded installation guide with FFmpeg setup, CUDA 12 requirements, and container recommendations
  • Updated release notes restructured for 26.02 release

Key improvements:

  • Reorganized table of contents (Setup & Deployment moved before content sections)
  • Registered the HuggingFace token URL as a known false positive for the broken-link check
  • Expanded heuristic filter and distributed classifier documentation
  • Improved tutorial README with command-line references and usage examples

The PR description explicitly notes intentional version-specific exclusions (#1430 uses output-clip-path in r1.1.0 vs output-path in main; #1441 uses batch_size vs dali_batch_size), ensuring proper compatibility with the r1.1.0 codebase.

Confidence Score: 5/5

  • This PR is safe to merge with no risk: it contains only documentation updates and no code changes
  • Documentation-only PR with 44 markdown/config files updated. Changes are well-scoped backports from main to r1.1.0 branch with clear version management (25.09 → 26.02). The PR author has documented intentional exclusions to maintain compatibility with r1.1.0 codebase. No code logic changes, no security concerns, and all updates follow established documentation patterns.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/about/release-notes/index.md | Updates release notes from 25.09 to 26.02, restructuring content and adding new feature documentation |
| docs/conf.py | Updates release version from 25.09 to 26.02 and container version string |
| docs/versions1.json | Adds 26.02 version as preferred release in version dropdown configuration |
| docs/curate-text/synthetic/index.md | New comprehensive synthetic data generation guide with overview and getting started sections |
| docs/curate-text/synthetic/llm-client.md | New documentation for LLM client configuration, covering AsyncOpenAIClient setup and performance tuning |
| docs/curate-text/synthetic/nemotron-cc/index.md | New advanced documentation for Nemotron-CC pipelines with architecture diagrams and composable patterns |
| docs/curate-text/synthetic/nemotron-cc/tasks.md | New comprehensive task reference for Nemotron-CC with examples for each task type |
| docs/admin/installation.md | Updates installation guide with CUDA 12 requirements, FFmpeg setup, and container recommendations |
| docs/curate-video/process-data/dedup.md | Major update adding SemanticDeduplicationWorkflow with improved examples and documentation structure |
| tutorials/synthetic/README.md | Expands synthetic tutorial README with Nemotron-CC examples, command-line reference, and usage guides |
| docs/index.md | Reorganizes table of contents, moving Setup & Deployment before content sections and adding Synthetic Data |
| docs/curate-text/process-data/quality-assessment/heuristic.md | Major expansion of heuristic filter documentation with examples and configuration details |
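
Because the backport bumps the version string from 25.09 to 26.02 across several files, a quick check for leftover references to the old version might look like the sketch below. The function name and the sample text are hypothetical; the PR does not describe such a script, and some hits (for example, historical entries in release notes) may be intentional.

```python
import re


def find_stale_versions(text: str, old_version: str = "25.09") -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that still mention old_version."""
    # Match the version only when it is not embedded in a longer number,
    # so "125.09" or "25.091" are not reported.
    pattern = re.compile(rf"(?<![\d.]){re.escape(old_version)}(?!\d)")
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Made-up sample content, not copied from any file in the PR.
sample = 'release = "26.02"\nimage_tag = "25.09"\n'
stale = find_stale_versions(sample)
```

Running the function over each changed file's text would flag lines that may have been missed by the version bump, leaving the decision about each hit to a reviewer.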

@greptile-apps greptile-apps bot left a comment

44 files reviewed, no comments


@sarahyurick sarahyurick merged commit f1e0a2c into r1.1.0 Feb 12, 2026
18 checks passed

2 participants