
Troubleshooting

Mayank Mishra edited this page Jun 5, 2014 · 1 revision

Problem Identification

The most obvious causes are often overlooked. Before digging deeper into a problem, run through the following basic checks:

Verify the system and environment requirements: Unmet system and environment requirements are among the most frequently overlooked causes of failure, and they can also lead to performance issues. Take the requirements into account when deploying Jumbune. Further information on the requirements is available in the Jumbune Installation Guide.

Collecting information: Collect and review the relevant information before investigating. The following sources help you drill down to the issue or its cause:

  • Logs (Jumbune deployment and Jumbune Agent).
  • Application environment variables and system configuration.
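As a starting point for log review, the sketch below scans a log file for the exception messages quoted later on this page. It is a minimal example, not part of Jumbune: the function name `find_known_errors` is hypothetical, the log file location is left to the caller, and the exact wording in real logs may differ from the phrases matched here.

```python
from pathlib import Path

# Phrases quoted on this page; the wording in actual logs is an assumption.
KNOWN_MESSAGES = ("jar not present", "class not found", "path already exist")


def find_known_errors(log_path):
    """Return (line_number, line) pairs for lines mentioning a known message."""
    hits = []
    text = Path(log_path).read_text(encoding="utf-8", errors="replace")
    for number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        # Case-insensitive substring match against each known phrase.
        if any(message in lowered for message in KNOWN_MESSAGES):
            hits.append((number, line.strip()))
    return hits
```

Call it with the path to a deployment or Agent log and inspect the returned line numbers first.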

Troubleshooting tools: Several tools can be used to collect information for troubleshooting. Commonly available ones include:

  • JConsole for JMX monitoring.
  • Operating system commands for examining the system state.
  • jhat for examining heap dumps (thread dumps can be taken with jstack).

Commonly Faced Issues

Deployment Issues

The lib directory is not created inside the $JUMBUNE_HOME folder: If the distribution jar is placed in the $JUMBUNE_HOME directory and the deployment is triggered from that same directory, the $JUMBUNE_HOME/lib directory gets deleted. This is a known bug. To avoid it, do not place the distribution jar in $JUMBUNE_HOME and do not trigger the deployment from within it.
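A quick pre-deployment check for the situation described above can be sketched as follows. The helper name `is_inside` is hypothetical and not part of Jumbune; it only compares resolved paths.

```python
from pathlib import Path


def is_inside(path, directory):
    """Return True if path is the directory itself or lies anywhere under it."""
    target = Path(path).resolve()
    base = Path(directory).resolve()
    # Path.parents lists every ancestor of target, nearest first.
    return target == base or base in target.parents
```

Before deploying, both the location of the distribution jar and the current working directory should return False when checked against the value of $JUMBUNE_HOME.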

YAML Validation issues

  • No response to the YAML form validation, even after a long wait: The probable reasons for getting no response from the server while validating the YAML are:

    • Hadoop daemons are not started. Check them by running the jps command.
    • Agent is not running.
    • Agent port is not specified correctly.
  • “Jar not present” exception traced in logs: The probable reasons for this exception are:

    • The jar path specified for the Jumbune job is incorrect.
    • The YAML form does not retain the uploaded Jumbune job jar path when a YAML is reused. Use the browse option each time you run a Jumbune job from the local repository.
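Two of the causes above, an unreachable Agent port and an incorrect jar path, can be checked before submitting the YAML. The sketch below assumes the Agent accepts TCP connections on its configured port; the function names `port_is_open` and `jar_exists` are hypothetical and not part of Jumbune.

```python
import socket
from pathlib import Path


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


def jar_exists(jar_path):
    """Return True if jar_path points to an existing regular file."""
    return Path(jar_path).is_file()
```

If `port_is_open` returns False for the Agent host and port entered in the YAML form, the Agent is either not running or listening on a different port.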

Module Issues

  • Debugger and Profiler are not running, although all configurations seem appropriate and the data validation and cluster monitor modules are running fine: The most common causes and their resolutions are listed below. Also check whether the machine name to IP address mapping is missing or malformed.

    • If a ‘Class not found’ exception is shown in the logs, run the command echo $JUMBUNE_AGENT and make sure a path separator is present at the end of the path.
    • If a ‘Path already exist’ exception is shown in the logs, the output directory may already be present on HDFS. Provide a path that does not exist yet.
  • Debugger logs are not created in the $JUMBUNE_HOME/jobJars/<job_name>/logs directory: Check that the zip utility is installed on every node of the cluster.

  • Cluster monitoring is not working and reports a “Failed Job Error”: This is a known issue. Clear the browser cache.

  • “Currently executing a job” error message is displayed on the UI without any job having been launched: When Jumbune is used from multiple browsers and several submitted YAMLs are validated successfully at the same time, a race condition can occur in which multiple Jumbune jobs are attempted at once. Since only one profiling job can execute at a time, the other users get the above message. Retry the profiling job once the running job completes.

  • Heap and CPU usage details are not displayed when running Jumbune on HDP: This information is currently not available via HPROF on HDP, so it is not displayed on the UI.
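Two of the module checks above (the trailing separator on the JUMBUNE_AGENT path and the presence of the zip utility) can be scripted per node. This is a minimal sketch with hypothetical helper names. It assumes "path separator" means the directory separator (`/` on Linux), which the page does not state explicitly.

```python
import os
import shutil


def ends_with_separator(path):
    """Return True if path ends with the OS directory separator (assumed meaning)."""
    return path.endswith(os.sep)


def zip_is_installed():
    """Return True if a zip executable is found on the PATH of this node."""
    return shutil.which("zip") is not None
```

On each node, pass `os.environ.get("JUMBUNE_AGENT", "")` to `ends_with_separator` and call `zip_is_installed()`. Either returning False points to the corresponding issue in the list above.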
