Slow memcpy when using TegraJPEG. #852

Open
@ksze

Description

I'm trying to find and fix bottlenecks in libfreenect2.

One issue I found is that memcpy'ing the decoded image out of TegraFrame.data (which is effectively dinfo.jpegTegraMgr->buff[0], where dinfo is the jpeg_decompress_struct) is very slow: around 55 ms when built at full optimization (-O3) with clang++-3.8.

In contrast, if I allocate plain char arrays of the same size on the heap, fill them with random data, and memcpy between them, the copy takes around 9.5 ms, which is not great, but still much better.
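For anyone who wants to reproduce the heap baseline, here is a minimal sketch of that kind of measurement. The helper names `make_random_buffer` and `time_memcpy_ms` are made up for illustration and are not part of libfreenect2 or TegraJPEG. The buffer size is left to the caller; something like 1920 * 1080 * 4 bytes would match a full colour frame only under the assumption of 4 bytes per pixel.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <random>
#include <vector>

// Hypothetical helper: heap buffer of `size` bytes filled with pseudo-random data.
std::vector<unsigned char> make_random_buffer(std::size_t size, unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::vector<unsigned char> buf(size);
    for (auto& b : buf) {
        b = static_cast<unsigned char>(gen() & 0xFFu);
    }
    return buf;
}

// Hypothetical helper: average wall-clock time, in milliseconds, of one
// memcpy of `size` bytes from `src` into a freshly allocated heap buffer.
double time_memcpy_ms(const unsigned char* src, std::size_t size, int iterations = 10) {
    if (size == 0 || iterations <= 0) {
        return 0.0;
    }
    assert(src != nullptr);

    std::vector<unsigned char> dst(size);
    // Untimed warm-up copy so first-touch page faults on dst are not measured.
    std::memcpy(dst.data(), src, size);

    volatile unsigned char sink = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::memcpy(dst.data(), src, size);
        sink = dst[size - 1];  // read the result so the copy stays observable
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;

    return std::chrono::duration<double, std::milli>(stop - start).count() / iterations;
}
```

Calling `time_memcpy_ms(buf.data(), buf.size())` on a buffer from `make_random_buffer` gives the heap number; passing the TegraFrame.data pointer and the same size to `time_memcpy_ms` would give the comparable figure for the decoder's buffer.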

So something is very wonky about using plain memcpy on the chunk of memory allocated by the hardware-accelerated library. There must be a way to get fast access to that memory; otherwise TegraJPEG seems pretty pointless, since the slow read access would defeat the purpose of the fast decode. I hope somebody familiar with the TegraJPEG/gstreamer stuff can pick this issue up and work on it.