Expose Stats within Spark #1219

Open
@nerdynick

Description

What kind of issue is this?

  • Bug report. If you’ve found a bug, please provide a code snippet or test to reproduce it below.
    The easier it is to track down the bug, the faster it is solved.
  • Feature Request. Start by telling us what problem you’re trying to solve.
    Often a solution already exists! Don’t send pull requests to implement new features without
    first getting our support. Sometimes we leave features out on purpose to keep the project small.

Feature description

It'd be nice to be able to get hold of the Stats object after a write within the Spark pipeline. This would allow diagnosing pipeline performance issues not just from the ES server side but also from the client side.

I think the easiest and simplest method would be to provide the Stats as a new Partition/RDD after the writes have all been performed. This would let users control how the stats are published in whatever way suits their platform. It would involve changes to the EsRDDWriter to record the stats after the write, as well as to the EsSpark and JavaEsSpark classes so that they return the data as opposed to returning nothing.
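To make the idea concrete, here is a rough sketch of the shape I have in mind, written in plain Java with no Spark or ES dependency so it can be run on its own. Everything in it is hypothetical: `WriteStats` is only a stand-in for the connector's Stats object, nested lists stand in for RDD partitions, and `writePartition` / `saveAndCollectStats` are made-up names, not existing EsSpark or JavaEsSpark methods.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WriteStatsSketch {

    /** Hypothetical stand-in for the connector's per-task Stats object. */
    public static final class WriteStats {
        public final long docsSent;
        public final long bytesSent;

        public WriteStats(long docsSent, long bytesSent) {
            this.docsSent = docsSent;
            this.bytesSent = bytesSent;
        }

        /** Combine two records, e.g. when reducing across partitions. */
        public WriteStats merge(WriteStats other) {
            return new WriteStats(docsSent + other.docsSent, bytesSent + other.bytesSent);
        }
    }

    /** Pretends to write one partition (no real I/O) and reports what was sent. */
    public static WriteStats writePartition(List<String> docs) {
        long bytes = 0;
        for (String doc : docs) {
            bytes += doc.getBytes(StandardCharsets.UTF_8).length;
        }
        return new WriteStats(docs.size(), bytes);
    }

    /** Writes every partition and returns one stats record per partition instead of nothing. */
    public static List<WriteStats> saveAndCollectStats(List<List<String>> partitions) {
        List<WriteStats> stats = new ArrayList<>();
        for (List<String> partition : partitions) {
            stats.add(writePartition(partition));
        }
        return stats;
    }

    /** Reduces the per-partition records into a single job-level total. */
    public static WriteStats total(List<WriteStats> perPartition) {
        WriteStats sum = new WriteStats(0, 0);
        for (WriteStats s : perPartition) {
            sum = sum.merge(s);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("{\"a\":1}", "{\"b\":2}"),
                Arrays.asList("{\"c\":3}"));
        WriteStats all = total(saveAndCollectStats(partitions));
        System.out.println("docs=" + all.docsSent + " bytes=" + all.bytesSent);
    }
}
```

In the real connector the per-partition step would presumably live in EsRDDWriter, and the list of records would be an RDD that the caller can reduce, log, or push to whatever metrics system they use.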
