Processing Errors are now Stored in the ElasticSearch Index #48

Closed
wants to merge 24 commits into from

johannesduesing

Reason for this PR
As described in #31, the Delphi Crawler does not keep track of errors that occur while processing artifacts. If, for example, the JAR download fails for any reason, the respective artifact is not stored in the ElasticSearch index, and there is no way to retry the download at a later point in time.

Changes in this PR (relative to #47, which needs to be merged first!)
The MavenDiscoveryProcess now accounts for different types of processing errors. These errors are caught and redirected to a dedicated actor, which stores them as a new mapping type (MavenProcessingError) in the existing ElasticSearch index named delphi. The initial database setup has been adapted so that the new mapping type is created on startup (for fresh DB instances only!). A MavenProcessingError has four properties:

  • type: Either PomDownloadFailed, JarDownloadFailed, PomParsingFailed or HermesProcessingFailed. Note that, unlike in previous versions of the crawler, a failed JAR download is no longer treated as an error if the POM file's packaging attribute is not set to jar.
  • occurred: Datetime at which the error occurred
  • identifier: Maven identifier referencing the artifact for which the error occurred
  • message: An error message that describes the error cause
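
To make the shape of such an error record concrete, here is a minimal Python sketch of the four properties listed above. It is an illustration only, not the crawler's actual code: the class names, field names, the `to_document` method and the `jar_failure_is_error` helper are all hypothetical. The JSON layout follows the sample response in this PR. How the crawler treats a POM without a packaging attribute is not stated here, so the helper only covers an explicit value.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class ErrorType(Enum):
    # The four error types named in this PR.
    POM_DOWNLOAD_FAILED = "PomDownloadFailed"
    JAR_DOWNLOAD_FAILED = "JarDownloadFailed"
    POM_PARSING_FAILED = "PomParsingFailed"
    HERMES_PROCESSING_FAILED = "HermesProcessingFailed"


@dataclass(frozen=True)
class MavenIdentifier:
    group_id: str
    artifact_id: str
    version: str


@dataclass(frozen=True)
class MavenProcessingError:
    identifier: MavenIdentifier
    occurred: datetime
    error_type: ErrorType
    message: str

    def to_document(self) -> dict:
        # Mirrors the "_source" layout of the sample response (assumed, not verified).
        return {
            "identifier": {
                "groupId": self.identifier.group_id,
                "artifactId": self.identifier.artifact_id,
                "version": self.identifier.version,
            },
            "occurred": self.occurred.isoformat(timespec="milliseconds"),
            "message": self.message,
            "type": self.error_type.value,
        }


def jar_failure_is_error(packaging: str) -> bool:
    # Hypothetical helper: a failed JAR download only counts as an error
    # when the POM's packaging attribute is set to "jar".
    return packaging == "jar"


if __name__ == "__main__":
    error = MavenProcessingError(
        identifier=MavenIdentifier("yan", "yan", "5.0"),
        occurred=datetime(2020, 10, 19, 15, 48, 2, 27000,
                          tzinfo=timezone(timedelta(hours=2))),
        error_type=ErrorType.POM_DOWNLOAD_FAILED,
        message="Got an unexpected HTTP response, code 404 Not Found.",
    )
    print(error.to_document())
```

Modelling the type as an enumeration keeps the set of allowed values closed, so a typo in an error type fails early instead of ending up in the index.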

Closes #31.

Sample ElasticSearch Request
GET <elasticsearch>/delphi/error/_search

"hits":[
         {
            "_index":"delphi",
            "_type":"error",
            "_id":"1603115282027",
            "_score":1.0,
            "_source":{
               "identifier":{
                  "groupId":"yan",
                  "artifactId":"yan",
                  "version":"5.0"
               },
               "occurred":"2020-10-19T15:48:02.027+02:00",
               "message":"Got an unexpected HTTP response, code 404 Not Found.",
               "type":"PomDownloadFailed"
            }
         },
         {
            "_index":"delphi",
            "_type":"error",
            "_id":"1603115325347",
            "_score":1.0,
            "_source":{
               "identifier":{
                  "groupId":"yan",
                  "artifactId":"jfunutil",
                  "version":"5.0.2"
               },
               "occurred":"2020-10-19T15:48:45.347+02:00",
               "message":"Got an unexpected HTTP response, code 404 Not Found.",
               "type":"PomDownloadFailed"
            }
         },[...]
]
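
Since the motivation for storing errors is to be able to retry failed artifacts later, the following Python sketch shows how a client might pick retry candidates out of a search response like the one above. It is an illustration under assumptions: the function name `failed_identifiers` is hypothetical, and the input is assumed to be the full ElasticSearch response, in which the hit list sits under `hits.hits`. No network access is involved.

```python
from typing import Any, Dict, List


def failed_identifiers(response: Dict[str, Any], error_type: str) -> List[str]:
    """Return 'groupId:artifactId:version' for every hit of the given error type."""
    result = []
    for hit in response.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        if source.get("type") != error_type:
            continue
        ident = source["identifier"]
        result.append(f'{ident["groupId"]}:{ident["artifactId"]}:{ident["version"]}')
    return result


if __name__ == "__main__":
    # Reduced copy of the sample response from this PR.
    sample = {
        "hits": {
            "hits": [
                {"_source": {
                    "identifier": {"groupId": "yan", "artifactId": "yan", "version": "5.0"},
                    "type": "PomDownloadFailed",
                }},
                {"_source": {
                    "identifier": {"groupId": "yan", "artifactId": "jfunutil", "version": "5.0.2"},
                    "type": "PomDownloadFailed",
                }},
            ]
        }
    }
    print(failed_identifiers(sample, "PomDownloadFailed"))
```

Filtering on the client side keeps the sketch independent of how the `type` field is mapped in the index; a real client would more likely filter in the query itself.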

Johannes Düsing and others added 24 commits July 9, 2020 15:04
…g, but no parent processing yet. Also no storage yet
…the whole parent hierarchy. Parents are only downloaded once, however, currently for every POM, not on-demand.
…least one version / attribute failed to resolve locally. However, if any parent is required the whole hierarchy will be downloaded! Fixed a bug in test shutdown.
@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A), Security Hotspots to review: 0
Code Smells: 0 (rating A)

Coverage: no coverage information
Duplication: 0.0%

@johannesduesing johannesduesing added this to the 0.9.6 milestone Oct 20, 2020
@johannesduesing
Author

Closed, as this functionality is now part of the redesign proposed in #50.
