
Serf long shot plan system #16

Draft
rjurney wants to merge 41 commits into rjurney/v0.1 from cursor/serf-long-shot-plan-system-b0d4

Conversation

@rjurney
Contributor

@rjurney rjurney commented Mar 8, 2026

Implement the complete SERF entity resolution system and establish benchmark baselines to fulfill the long-shot plan and prepare for PyPI.

This PR converts the project to uv, integrates Ruff, and adds core modules for blocking, matching, merging, evaluation, analysis, edge resolution, Spark integration, DSPy agents, and a CLI. Baseline F1 scores were established on the DBLP-ACM, Abt-Buy, and DBLP-Scholar datasets using embedding similarity.
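An embedding-similarity baseline of this kind can be pictured as thresholded cosine similarity over candidate pairs. The sketch below is illustrative only: the function names, the toy two-dimensional vectors (standing in for real sentence-transformer embeddings), and the 0.8 threshold are assumptions, not the project's actual code or settings.

```python
from math import sqrt


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def match_by_similarity(pairs, threshold: float = 0.8):
    """Keep candidate pairs whose embedding similarity clears the threshold."""
    return [(i, j) for (i, j, ea, eb) in pairs if cosine(ea, eb) >= threshold]


# Toy vectors stand in for real sentence-transformer embeddings.
candidates = [
    (1, 2, [1.0, 0.0], [0.9, 0.1]),  # near-duplicate records
    (1, 3, [1.0, 0.0], [0.0, 1.0]),  # unrelated records
]
print(match_by_similarity(candidates))  # -> [(1, 2)]
```

Precision, recall, and F1 then follow from comparing the predicted pairs against a gold mapping such as the DeepMatcher perfect-mapping files.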



cursoragent and others added 10 commits March 8, 2026 04:34
Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…te config and module structure

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…ipeline with tests

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…ger with tests

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…tion with tests

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…s with tests

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…, download commands

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…s, fix text column detection

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
@cursor

cursor bot commented Mar 8, 2026

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.
Learn more about Cursor Agents

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the complete initial implementation of the Semantic Entity Resolution Framework (SERF). It establishes the core architecture for agentic entity resolution, leveraging modern Python tooling and AI frameworks. The changes enable comprehensive data processing, intelligent matching, and robust evaluation, laying the groundwork for future enhancements and PyPI publication.

Highlights

  • Dependency Management Migration: Migrated the project's dependency management and build system from Poetry to uv and Hatchling, updating pyproject.toml and related configuration files.
  • Linting and Formatting Tooling: Replaced black, isort, and flake8 with Ruff for all linting and formatting, streamlining the pre-commit hooks and development workflow.
  • Core Entity Resolution Modules: Implemented foundational modules for the Semantic Entity Resolution Framework (SERF), including semantic blocking (embeddings, FAISS), LLM-based matching and merging, evaluation metrics, dataset analysis, and edge resolution.
  • DSPy Agent Integration: Integrated DSPy agents for dynamic orchestration of the ER pipeline, enabling intelligent control over blocking parameters, matching strategies, and convergence detection.
  • Comprehensive CLI and Benchmarking: Introduced a robust command-line interface (CLI) with commands for analyzing datasets, performing blocking, matching, evaluation, and running full ER pipelines, alongside a system for benchmarking against standard datasets.
  • Spark Integration: Added initial Spark integration components, including Pydantic-to-Spark schema conversion, graph algorithms for connected components, and Iceberg table read/write functionalities.
  • Project Documentation and Licensing: Updated README.md with detailed architecture, quick start guides, and benchmark results. Added an Apache License 2.0 file and a comprehensive BUILD_PLAN.md.
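The Spark integration highlight above mentions graph algorithms for connected components, which is how pairwise match edges are collapsed into entity clusters. The following is a plain-Python union-find sketch of that idea for intuition only; it is not the project's Spark implementation, and all names in it are illustrative.

```python
def connected_components(edges):
    """Union-find over match edges; returns {node: component_root}."""
    parent: dict = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # smallest id becomes the root

    for a, b in edges:
        union(a, b)
    return {n: find(n) for n in parent}


# Edges 1-2 and 2-3 chain into one cluster; 4-5 forms a second cluster.
print(connected_components([(1, 2), (2, 3), (4, 5)]))  # -> {1: 1, 2: 1, 3: 1, 4: 4, 5: 4}
```

In a Spark setting the same result is typically computed iteratively over DataFrames rather than with an in-memory parent table.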
Changelog
  • .gitignore
    • Added new ignore rules for Python build artifacts, virtual environments, uv lock files, and common IDE directories.
  • .pre-commit-config.yaml
    • Replaced black, flake8, and isort hooks with ruff for linting and formatting.
    • Updated zuban hook to use uv run for execution.
  • CLAUDE.md
    • Updated development instructions to reflect the migration to uv and ruff.
    • Replaced references to BAML with DSPy for LLM-related code guidance.
    • Removed the section on BAML Client Generation.
  • LICENSE
    • Added the Apache License 2.0 for the project.
  • README.md
    • Updated the project title and added license and Python version badges.
    • Rewrote the 'Features' section to detail the agentic ER phases.
    • Updated the 'Architecture' table to reflect new tooling like uv, PySpark 4.x, DSPy 3.x, Qwen3-Embedding, FAISS, and Ruff.
    • Revised 'Quick Start' and 'Development' sections to use uv and ruff commands.
    • Added 'Benchmark Results' and 'Project Structure' sections.
    • Updated 'References' and 'License' information.
  • config.yml
    • Expanded configuration with new sections for models (embedding, LLM, temperature), er (blocking, matching, evaluation, paths), and benchmarks (output directory, dataset definitions).
  • docs/BUILD_PLAN.md
    • Added a detailed build plan outlining the phased implementation of SERF, from infrastructure setup to PyPI preparation.
  • pyproject.toml
    • Migrated the project's build system from Poetry to Hatchling.
    • Updated project metadata including license, authors, classifiers, and URLs.
    • Revised dependencies to include dspy-ai, click, pyyaml, pyspark, sentence-transformers, faiss-cpu, cleanco, tqdm, numpy, pandas.
    • Updated development dependencies to pytest, pytest-asyncio, ruff, zuban, pre-commit, types-pyyaml.
    • Configured ruff for linting and formatting rules.
  • scripts/generate_benchmark_data.py
    • Added a new script to generate synthetic benchmark datasets (DBLP-ACM, Walmart-Amazon, DBLP-Scholar) in DeepMatcher format for testing.
  • scripts/run_benchmarks.py
    • Added a new script to execute the SERF pipeline on benchmark datasets, including embedding, FAISS blocking, and evaluation.
  • src/serf/analyze/__init__.py
    • Added the __init__.py file to define the analyze module and expose DatasetProfiler and detect_field_type.
  • src/serf/analyze/field_detection.py
    • Added a new module with detect_field_type function for inferring data types based on field names and values.
  • src/serf/analyze/profiler.py
    • Added a new DatasetProfiler class to analyze dataset characteristics, including completeness, uniqueness, and recommended ER fields.
  • src/serf/block/embeddings.py
    • Added a new EntityEmbedder class for generating entity embeddings using sentence-transformers with device auto-detection.
  • src/serf/block/faiss_blocker.py
    • Added a new FAISSBlocker class for clustering entity embeddings into blocks using FAISS IndexIVFFlat with auto-scaling capabilities.
  • src/serf/block/normalize.py
    • Added a new module for various name normalization utilities, including corporate suffix removal, acronym generation, and domain suffix stripping.
  • src/serf/block/pipeline.py
    • Added a new SemanticBlockingPipeline class to orchestrate the embedding, clustering, and splitting of entity blocks.
  • src/serf/cli/main.py
    • Expanded the CLI with new commands for analyze, block, match, eval, edges, resolve (full pipeline), benchmark, and download.
    • Updated existing command implementations to integrate with the new SERF modules.
  • src/serf/config.py
    • Refined type hints in the Config class, replacing Optional and Union with native Python type union syntax.
    • Updated exception handling to use from err for better traceback clarity.
  • src/serf/dspy/agents.py
    • Added a new ERAgent class that uses DSPy ReAct to control the entity resolution pipeline dynamically.
  • src/serf/dspy/baml_adapter.py
    • Modified format_field_structure to correctly iterate over input fields.
  • src/serf/dspy/signatures.py
    • Added new DSPy signatures (BlockMatch, EntityMerge, EdgeResolve, AnalyzeDataset) to define LLM input/output contracts for ER tasks.
  • src/serf/dspy/type_generator.py
    • Added a new module with entity_type_from_spark_schema to dynamically create Pydantic Entity subclasses from Spark schemas.
  • src/serf/dspy/types.py
    • Replaced previous BAML-generated types with new core Pydantic types (Entity, EntityBlock, MatchDecision, BlockResolution, FieldProfile, DatasetProfile, IterationMetrics, BlockingMetrics) for the SERF pipeline.
  • src/serf/edge/__init__.py
    • Added the __init__.py file to define the edge module and expose EdgeResolver.
  • src/serf/edge/resolver.py
    • Added a new EdgeResolver class for grouping and resolving duplicate edges using an LLM after entity merging.
  • src/serf/eval/__init__.py
    • Added the __init__.py file to define the eval module.
  • src/serf/eval/benchmarks.py
    • Added a new BenchmarkDataset class for managing, downloading, and converting standard ER benchmark datasets.
  • src/serf/eval/metrics.py
    • Added a new module for calculating standard entity resolution evaluation metrics like precision, recall, F1 score, and reduction ratio.
  • src/serf/match/__init__.py
    • Added the __init__.py file to define the match module and expose EntityMatcher, UUIDMapper, and few-shot example functions.
  • src/serf/match/few_shot.py
    • Added a new module for generating and formatting few-shot examples to guide LLM matching behavior.
  • src/serf/match/matcher.py
    • Added a new EntityMatcher class to resolve entity blocks using DSPy's BlockMatch signature, including UUID mapping and async processing.
  • src/serf/match/uuid_mapper.py
    • Added a new UUIDMapper class to handle the conversion of entity IDs to integers for LLM processing and their restoration.
  • src/serf/merge/__init__.py
    • Added the __init__.py file to define the merge module and expose EntityMerger.
  • src/serf/merge/merger.py
    • Added a new EntityMerger class for combining multiple entities into a single canonical record, prioritizing complete field values.
  • src/serf/spark/__init__.py
    • Added the __init__.py file to define the spark module.
  • src/serf/spark/graph.py
    • Added a new module with Spark graph algorithms, specifically for finding connected components.
  • src/serf/spark/iceberg.py
    • Added a new module for Iceberg integration, providing utilities for SparkSession configuration and table operations.
  • src/serf/spark/schemas.py
    • Added a new module for bridging Pydantic and Spark schemas, including type conversion and schema normalization utilities.
  • src/serf/spark/utils.py
    • Added a new module with shared Spark utilities, such as splitting large blocks and selecting most common properties.
  • tests/test_agents.py
    • Added new tests for the ERAgent class, covering initialization, signature fields, and tool function behavior.
  • tests/test_benchmarks.py
    • Added new tests for the BenchmarkDataset class, verifying dataset availability, creation, evaluation, and entity conversion.
  • tests/test_cli.py
    • Added new tests for the serf command-line interface, checking help messages and basic command functionality.
  • tests/test_dspy.py
    • Updated the lm fixture's type hint and docstring for clarity.
  • tests/test_edge_resolver.py
    • Added new tests for the EdgeResolver class, covering edge grouping and resolution logic.
  • tests/test_embeddings.py
    • Added new tests for the EntityEmbedder class and get_torch_device function.
  • tests/test_faiss_blocker.py
    • Added new tests for the FAISSBlocker class, verifying block creation, auto-scaling, and ID preservation.
  • tests/test_few_shot.py
    • Added new tests for few-shot example generation and formatting functions.
  • tests/test_field_detection.py
    • Added new tests for the detect_field_type function, covering various data types and heuristics.
  • tests/test_graph.py
    • Added new tests for the Spark connected_components function.
  • tests/test_merger.py
    • Added new tests for the EntityMerger class, verifying entity merging logic, source ID/UUID accumulation, and value selection.
  • tests/test_metrics.py
    • Added new tests for entity resolution evaluation metrics, ensuring correct calculation of precision, recall, and F1 score.
  • tests/test_normalize.py
    • Added new tests for name normalization functions, including handling of whitespace, punctuation, unicode, and corporate/domain suffixes.
  • tests/test_profiler.py
    • Added new tests for the DatasetProfiler class, covering dataset analysis and field recommendations.
  • tests/test_schemas.py
    • Added new tests for Spark schema utilities, including Pydantic-to-Spark conversion and schema validation.
  • tests/test_signatures.py
    • Added new tests for DSPy signature definitions, verifying input/output fields and predictor creation.
  • tests/test_type_generator.py
    • Added new tests for entity_type_from_spark_schema and spark_type_to_python functions.
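The changelog describes a UUIDMapper that converts entity IDs to integers for LLM processing and restores them afterwards. The minimal sketch below illustrates that round-trip idea only; the actual class in src/serf/match/uuid_mapper.py is richer (transitive source_ids/source_uuids collection, dedup), and every name here is assumed for illustration.

```python
class UUIDMapperSketch:
    """Map long entity UUIDs to small integers for compact LLM prompts,
    then restore the original UUIDs from the LLM's integer answers."""

    def __init__(self) -> None:
        self._to_int: dict[str, int] = {}
        self._to_uuid: dict[int, str] = {}

    def map(self, uuid: str) -> int:
        # Assign the next small integer the first time a UUID is seen.
        if uuid not in self._to_int:
            n = len(self._to_int) + 1
            self._to_int[uuid] = n
            self._to_uuid[n] = uuid
        return self._to_int[uuid]

    def unmap(self, n: int) -> str:
        return self._to_uuid[n]


m = UUIDMapperSketch()
a = m.map("550e8400-e29b-41d4-a716-446655440000")
b = m.map("6fa459ea-ee8a-3ca4-894e-db77e160355e")
print(a, b)                 # -> 1 2
print(m.unmap(a))           # -> 550e8400-e29b-41d4-a716-446655440000
```

Shorter integer IDs keep prompts small and avoid the LLM mangling long hexadecimal strings when it reports which entities match.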

@gemini-code-assist gemini-code-assist bot left a comment
Code Review

This is a massive and impressive pull request that implements the core SERF entity resolution system. The migration to uv and ruff is a great modernization step. The new modules for blocking, matching, evaluation, and the comprehensive CLI are well-structured and follow good Python practices. My review focuses on a few key areas to improve robustness and maintainability: ensuring the CLI can handle various data inputs without crashing, correcting a potentially problematic assumption in the ID mapping logic, centralizing configuration, and a minor style fix in a script. Overall, this is a fantastic contribution that lays a solid foundation for the project.

```python
entities.append(
    Entity(
        id=int(row_dict.get("id", idx)),  # type: ignore[arg-type]
```


high

The conversion int(row_dict.get("id", idx)) assumes that if an id column exists, its values can be cast to an integer. This will raise a ValueError if the IDs are non-integer strings (e.g., UUIDs), causing the CLI to crash. It would be more robust to handle this potential ValueError.

Comment on lines +96 to +100

```python
if missing_ids and resolution.resolved_entities:
    first = resolution.resolved_entities[0]
    existing_sources = set(first.source_ids or [])
    first_sources = list(existing_sources | missing_ids)
    resolution.resolved_entities[0] = first.model_copy(update={"source_ids": first_sources})
```


high

The "Phase 1" recovery for missing entities in unmap_block assumes that any entity missing from the LLM's output should be considered merged into the first resolved entity. This is a strong assumption that could lead to incorrect data provenance, as the LLM might have dropped the entity for other reasons (e.g., context length). The "Phase 2" recovery, which re-adds the missing entity and marks it as skipped, is a much safer approach. I recommend removing Phase 1 to avoid incorrect source attribution.

Suggested change

```diff
-if missing_ids and resolution.resolved_entities:
-    first = resolution.resolved_entities[0]
-    existing_sources = set(first.source_ids or [])
-    first_sources = list(existing_sources | missing_ids)
-    resolution.resolved_entities[0] = first.model_copy(update={"source_ids": first_sources})
+# if missing_ids and resolution.resolved_entities:
+#     first = resolution.resolved_entities[0]
+#     existing_sources = set(first.source_ids or [])
+#     first_sources = list(existing_sources | missing_ids)
+#     resolution.resolved_entities[0] = first.model_copy(update={"source_ids": first_sources})
```

```python
matched_right = [e for e in right_entities if e.id in gt_right_ids]
unmatched_right = [e for e in right_entities if e.id not in gt_right_ids]
sample_size = max(0, max_entities - len(matched_right))
import random
```


medium

The import random statement is located inside the run_benchmark function. According to PEP 8, imports should be placed at the top of the file. This improves readability and makes it easier to see the script's dependencies at a glance. Please move this import to the top-level of the script.

Comment on lines +18 to +49

```python
DATASET_REGISTRY: dict[str, dict[str, str]] = {
    "dblp-acm": {
        "url": "https://dbs.uni-leipzig.de/files/datasets/DBLP-ACM.zip",
        "table_a_name": "DBLP2.csv",
        "table_b_name": "ACM.csv",
        "mapping_name": "DBLP-ACM_perfectMapping.csv",
        "mapping_col_a": "idDBLP",
        "mapping_col_b": "idACM",
        "domain": "bibliographic",
        "difficulty": "easy",
    },
    "dblp-scholar": {
        "url": "https://dbs.uni-leipzig.de/files/datasets/DBLP-Scholar.zip",
        "table_a_name": "DBLP1.csv",
        "table_b_name": "Scholar.csv",
        "mapping_name": "DBLP-Scholar_perfectMapping.csv",
        "mapping_col_a": "idDBLP",
        "mapping_col_b": "idScholar",
        "domain": "bibliographic",
        "difficulty": "medium",
    },
    "abt-buy": {
        "url": "https://dbs.uni-leipzig.de/files/datasets/Abt-Buy.zip",
        "table_a_name": "Abt.csv",
        "table_b_name": "Buy.csv",
        "mapping_name": "abt_buy_perfectMapping.csv",
        "mapping_col_a": "idAbt",
        "mapping_col_b": "idBuy",
        "domain": "products",
        "difficulty": "hard",
    },
}
```


medium

The DATASET_REGISTRY is hardcoded within this file. However, there is a benchmarks.datasets section in config.yml that seems to define the same information. This creates a discrepancy and a maintainability issue, as changes to benchmark datasets would need to be made in two places. To centralize configuration, this registry should be loaded from config.yml.
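One possible shape for the reviewer's suggestion: read the registry out of the benchmarks.datasets section of config.yml with PyYAML (already a project dependency per pyproject.toml). The config keys shown in the snippet are assumptions for illustration, not the file's verified contents.

```python
import yaml  # PyYAML; listed among the project's dependencies

# Inline stand-in for config.yml (structure assumed for illustration).
CONFIG_SNIPPET = """
benchmarks:
  datasets:
    dblp-acm:
      url: https://dbs.uni-leipzig.de/files/datasets/DBLP-ACM.zip
      domain: bibliographic
      difficulty: easy
"""


def load_dataset_registry(text: str) -> dict:
    """Parse YAML and return benchmarks.datasets, so the dataset
    registry is defined in exactly one place."""
    cfg = yaml.safe_load(text) or {}
    return cfg.get("benchmarks", {}).get("datasets", {})


registry = load_dataset_registry(CONFIG_SNIPPET)
print(registry["dblp-acm"]["difficulty"])  # -> easy
```

In the real script the text would come from reading config.yml from disk, with the hardcoded dict kept only as a fallback if desired.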

cursoragent and others added 17 commits March 8, 2026 16:27
…r F1=0.90 (embedding baseline)

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…flag, remove scripts/ directory

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…h optional YAML config

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…M for all matching

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…tput flag

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…th validation

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…fix analyze LLM guidance, add auto-convergence

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…arison test, fix analyze config generation

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…ig for agentic override

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…All matching via LLM.

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…hing

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…g (Gemini Flash, 30 concurrent)

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…se 1 recovery, use optional-dependencies, rename _resolve_blocks_with_llm, document FAISS type ignores

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
… block splitting for small targets

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
cursoragent and others added 14 commits March 8, 2026 22:01
…king

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
- pipeline.py: Assign UUIDs at entity creation, track all historical UUIDs
- uuid_mapper.py: Transitive source_ids/source_uuids collection, dedup, exclude self
- merger.py: Dedup source_ids/source_uuids, exclude master's own ID/UUID
- matcher.py: Add iteration param to resolve_block/resolve_blocks, set match_skip_history
- metrics.py: Add validate_source_uuids function
- Tests for dedup, self-exclusion, transitive accumulation, and UUID validation

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…analysis, UUID validation, PASS/FAIL checks

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…ss-cpu

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
… FAISS compatibility

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
… processes to fix macOS MPS segfault

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…NING.md from Eridu lessons

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…ngual-e5-base, remove all pip references

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…P=0.885 R=0.581 F1=0.701

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
…rvice profiles

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
… defenses, validate LLM config output, deduplicate CLI helpers

Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>
Co-authored-by: Russell Jurney <rjurney@users.noreply.github.com>