Skip to content

Memory mapped file#5

Open
ryanstocks00 wants to merge 13 commits intomainfrom
mmap-file
Open

Memory mapped file#5
ryanstocks00 wants to merge 13 commits intomainfrom
mmap-file

Conversation

@ryanstocks00
Copy link
Contributor

No description provided.

@codecov
Copy link

codecov bot commented Feb 28, 2026

Codecov Report

❌ Patch coverage is 96.00000% with 24 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/las_reader.hpp 86.30% 10 Missing ⚠️
src/utilities/memory_mapped_file.hpp 88.33% 7 Missing ⚠️
src/utilities/thread_pool.hpp 90.00% 3 Missing ⚠️
src/laz/stream.hpp 88.88% 2 Missing ⚠️
src/laz/laz_reader.hpp 96.42% 1 Missing ⚠️
src/laz/laz_writer.hpp 91.66% 1 Missing ⚠️
Flag Coverage Δ
unittests 92.20% <96.00%> (+0.45%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/laz/integer_encoder.hpp 100.00% <100.00%> (ø)
src/laz/layered_stream.hpp 100.00% <100.00%> (ø)
src/laz/tests/test_point14_encoder.cpp 100.00% <100.00%> (ø)
src/tests/test_reader.cpp 100.00% <100.00%> (ø)
src/utilities/env.hpp 100.00% <100.00%> (ø)
src/utilities/tests/test_memory_mapped_file.cpp 100.00% <100.00%> (ø)
src/laz/laz_reader.hpp 87.50% <96.42%> (+0.90%) ⬆️
src/laz/laz_writer.hpp 90.47% <91.66%> (+0.17%) ⬆️
src/laz/stream.hpp 98.29% <88.88%> (-1.71%) ⬇️
src/utilities/thread_pool.hpp 88.88% <90.00%> (-1.44%) ⬇️
... and 2 more

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ryanstocks00 ryanstocks00 force-pushed the mmap-file branch 2 times, most recently from 4f40471 to 79f143a Compare February 28, 2026 12:21
@ryanstocks00 ryanstocks00 force-pushed the mmap-file branch 4 times, most recently from f2eead6 to 29e3bac Compare March 1, 2026 03:08
@ryanstocks00 ryanstocks00 force-pushed the mmap-file branch 2 times, most recently from 040a63d to 7441233 Compare March 1, 2026 04:46
@ryanstocks00 ryanstocks00 force-pushed the mmap-file branch 2 times, most recently from a70f8a3 to b3d72ef Compare March 1, 2026 12:36
@ryanstocks00 ryanstocks00 force-pushed the mmap-file branch 2 times, most recently from dca3d38 to 4393826 Compare March 2, 2026 13:10
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a memory-mapped file I/O path for LASReader (with a stream fallback and an env-var override), alongside related performance/benchmarking improvements and LAZ encoder memory/layout refactors.

Changes:

  • Add MemoryMappedFile utility and integrate it into LASReader for path-based construction (with optional LASPP_DISABLE_MMAP override and fallback to stream I/O).
  • Refactor LAZ encoder storage to use std::unique_ptr inside the LAZEncoder variant and add decompression prefetching.
  • Extend benchmarks/tests to cover mmap vs stream paths and add a “roofline” raw file read baseline.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
src/utilities/thread_pool.hpp Lazy thread creation + global pool init refactor used by parallel decompression.
src/utilities/memory_mapped_file.hpp New cross-platform read-only memory-mapped file utility.
src/utilities/tests/test_memory_mapped_file.cpp New standalone tests for MemoryMappedFile behavior and move semantics.
src/utilities/macros.hpp Adds LASPP_PREFETCH macro used by LAZ decompression.
src/las_reader.hpp Adds mmap-based path constructor + unified get_bytes() to support both I/O modes.
src/laz/stream.hpp Makes pointer stream buffer read-only and adds seeking support (used by mmap path).
src/laz/encoders.hpp Changes LAZEncoder to store heap-allocated encoders to shrink variant size.
src/laz/laz_writer.hpp Updates writer logic for LAZEncoder now holding unique_ptrs.
src/laz/laz_reader.hpp Updates reader logic for LAZEncoder now holding unique_ptrs; adds prefetch.
src/laz/layered_stream.hpp Makes layered input stream constructor const-correct for read-only spans.
src/laz/integer_encoder.hpp Adds constructor overloads for symbol encoder sharing/ownership.
src/laz/tests/test_point14_encoder.cpp Const-correct span usage for layered stream inputs.
src/tests/test_reader.cpp Adds coverage for mmap vs stream LASReader paths + disable-mmap env var.
apps/benchmark.cpp Uses path-based LASReader, adds raw read “roofline” benchmark, tweaks logging.
apps/benchmark.py Updates plotting (exclude threads=0 for scaling tools) and adds roofline line.
CMakeLists.txt Disables benchmark-by-default when included as a subproject.
Comments suppressed due to low confidence (2)

src/utilities/thread_pool.hpp:141

  • The worker tasks capture func by value (enqueue([..., func, ...])), which requires Func to be copyable and makes an extra copy per worker. This is an API regression vs capturing by reference (and breaks move-only callables). Consider capturing func by reference (safe because parallel_for waits), or storing it behind a shared_ptr/reference_wrapper.
    // Wait for all work to complete
    while (completed_count->load() < total_work) {
      std::this_thread::yield();
    }
  }

 private:
  void enqueue(std::function<void()> task) {
    {
      std::unique_lock<std::mutex> lock(m_queue_mutex);
      m_tasks.emplace(std::move(task));

src/las_reader.hpp:346

  • read_points() reinterprets raw bytes as PointType and dereferences the pointer. Even for packed/trivial types this relies on implicit object-lifetime rules and can be undefined behavior under strict aliasing/UBSan. A safer approach is to std::memcpy into a local PointType (or std::bit_cast when applicable) and then copy from that object.

  template <typename T>
  std::span<T> read_chunks(std::span<T> output_location, std::pair<size_t, size_t> chunk_indexes) {
    if (header().is_laz_compressed()) {
      size_t compressed_start_offset =

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 122 to 147
@@ -154,31 +143,26 @@ class ThreadPool {
m_condition.notify_one();
}

size_t m_num_threads;
size_t m_max_threads;
std::vector<std::thread> m_workers;
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parallel_for() now uses a tight busy-wait loop (yield() until completed_count == total_work) and increments an atomic for every single index. For large ranges this can significantly increase contention and CPU usage. Consider using a std::latch/condition_variable to block the caller, and batching indices (chunking) to reduce atomic traffic.

Copilot uses AI. Check for mistakes.
Comment on lines +28 to +30
#include <intrin.h>
#define LASPP_PREFETCH(addr) _mm_prefetch(reinterpret_cast<const char*>(addr), _MM_HINT_T0)
#else
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LASPP_PREFETCH on MSVC unconditionally uses _mm_prefetch / _MM_HINT_T0, but this won’t compile on non-x86 Windows targets (e.g., ARM64), and <intrin.h> is not guaranteed to provide the SSE prefetch intrinsics/macros on all configurations. Consider guarding with architecture checks and/or including <xmmintrin.h>/<immintrin.h> when available, with a no-op fallback otherwise.

Suggested change
#include <intrin.h>
#define LASPP_PREFETCH(addr) _mm_prefetch(reinterpret_cast<const char*>(addr), _MM_HINT_T0)
#else
#include <intrin.h>
#if defined(_M_IX86) || defined(_M_X64) || defined(_M_AMD64)
#include <xmmintrin.h>
#define LASPP_PREFETCH(addr) _mm_prefetch(reinterpret_cast<const char*>(addr), _MM_HINT_T0)
#else
#define LASPP_PREFETCH(addr) ((void)0)
#endif
#else

Copilot uses AI. Check for mistakes.
Comment on lines +84 to +93

ReadBuffer get_bytes(size_t offset, size_t size) {
if (m_mapped_file.has_value()) {
return ReadBuffer(m_mapped_file->subspan(offset, size));
}
LASPP_ASSERT(m_input_stream != nullptr, "m_input_stream must be set for stream-based I/O");
std::vector<std::byte> buf(size);
m_input_stream->seekg(static_cast<int64_t>(offset));
LASPP_CHECK_READ(
m_input_stream->read(reinterpret_cast<char*>(buf.data()), static_cast<int64_t>(size)));
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_bytes() seeks with seekg(static_cast<int64_t>(offset)) and then reads without checking whether the seek succeeded. If offset exceeds int64_t or the seek fails, the subsequent read may behave unexpectedly (and LASPP_CHECK_READ is a no-op when LASPP_DEBUG_ASSERTS is off). Consider using std::streamoff/std::ios::beg, checking good() after seekg, and handling conversion/overflow explicitly.

Copilot uses AI. Check for mistakes.
Comment on lines 276 to +289
auto t0 = Clock::now();
{
std::ifstream ifs(path, std::ios::binary);
LASReader reader(ifs);
// Use file path constructor - automatically uses memory mapping for optimal performance
LASReader reader(path);
if (!reader.is_using_memory_mapping()) {
std::cerr << "WARNING: LASReader is NOT using memory mapping (fell back to streams)\n";
} else {
// Only print once per benchmark run, not per iteration
static bool printed = false;
if (!printed) {
std::cerr << "INFO: LASReader is using memory-mapped file I/O\n";
printed = true;
}
}
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benchmark prints to std::cerr inside the timed region. Even printing only once can skew the first timing, and the warning path prints on every iteration if mapping is unavailable. Consider moving this logging outside the timed scope (or gating it behind a flag) so results reflect I/O/decompression performance only.

Copilot uses AI. Check for mistakes.
#endif
}

// Test 7: Verify data() returns correct span
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test numbering skips from “Test 5” to “Test 7”, which makes future additions/triage a bit confusing. Consider renumbering this comment to “Test 6” or adding the missing section.

Suggested change
// Test 7: Verify data() returns correct span
// Test 6: Verify data() returns correct span

Copilot uses AI. Check for mistakes.
Comment on lines +161 to +163
std::max(size_t{1}, static_cast<size_t>(std::thread::hardware_concurrency()));
pool = std::make_unique<ThreadPool>(max_threads);
});
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m_queue_mutex does not need to be mutable for ensure_threads() (the method is non-const). Marking it mutable weakens const-correctness for the rest of the class; consider reverting to a non-mutable mutex unless there is a const method that must lock it.

Copilot uses AI. Check for mistakes.

// Explicitly disable write operations to ensure const-correctness
std::streambuf::int_type overflow(std::streambuf::int_type) override {
LASPP_FAIL("PointerStreamBuffer: write operation attempted on read-only buffer");
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PointerStreamBuffer::overflow() uses LASPP_FAIL(...), but when LASPP_DEBUG_ASSERTS is off that macro compiles down to UNREACHABLE() (UB if a write is attempted). Since this override is a standard-library error path, it should return traits_type::eof() (and ideally not rely on debug-only asserts) to keep behavior defined in release builds.

Suggested change
LASPP_FAIL("PointerStreamBuffer: write operation attempted on read-only buffer");
// Signal failure to write on a read-only buffer without invoking undefined behavior.
return traits_type::eof();

Copilot uses AI. Check for mistakes.
Comment on lines +52 to +65
pos_type seekoff(off_type off, std::ios_base::seekdir dir,
std::ios_base::openmode which = std::ios_base::in) override {
if (which & std::ios_base::out) return pos_type(off_type(-1)); // read-only buffer
char* new_gptr;
if (dir == std::ios_base::beg) {
new_gptr = eback() + off;
} else if (dir == std::ios_base::cur) {
new_gptr = gptr() + off;
} else { // std::ios_base::end
new_gptr = egptr() + off;
}
if (new_gptr < eback() || new_gptr > egptr()) return pos_type(off_type(-1));
setg(eback(), new_gptr, egptr());
return pos_type(new_gptr - eback());
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seekoff() does pointer arithmetic like eback() + off before checking bounds. If off is negative (valid for seekg), forming a pointer before eback() is undefined behavior even if you later compare it. Please do bounds checks using integer offsets (e.g., compute base_index/new_index as ptrdiff_t and validate the range) before converting back to pointers.

Copilot uses AI. Check for mistakes.
Comment on lines +204 to +205
std::streamsize size = file.tellg();
file.seekg(0, std::ios::beg);
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raw_file_read_once() does not validate tellg() before using it as a vector size. If tellg() returns -1 (seek/tell failure), static_cast<size_t>(size) becomes huge and will attempt a massive allocation. Please check size >= 0 (and file.good()) before allocating/reading.

Suggested change
std::streamsize size = file.tellg();
file.seekg(0, std::ios::beg);
std::streamsize size = file.tellg();
if (!file.good() || size < 0) {
throw std::runtime_error("Failed to determine file size: " + path.string());
}
file.seekg(0, std::ios::beg);
if (!file.good()) {
throw std::runtime_error("Failed to seek in file: " + path.string());
}

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants