Elasticsearch setup guide
This guide walks you through setting up a production Elasticsearch instance on Linux (we assume that you have some Linux experience). You can use this guide to set up and configure your own nodes on Azure. We HIGHLY recommend that you use the Elasticsearch Azure ARM templates to set up a cluster in Azure. This ensures that you are following best practices from the start.
We'd love your feedback on how we can improve our setup and configuration. It would be greatly appreciated if a Docker guru could create some Docker images based on the following tutorial :).
Let's start by creating a new virtual machine and selecting the latest 64-bit Ubuntu operating system. After you're up and running, let's ensure it's running the latest software:
sudo apt-get update
sudo apt-get upgrade

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
java -version

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-5.x.list
sudo apt-get update && sudo apt-get install elasticsearch

For more information on running Elasticsearch as a service (using systemd), see the Elasticsearch documentation.
Elasticsearch can run in a Docker container. The official elasticsearch repository is located on Docker Hub. Follow the instructions given there (elasticsearch.yml has to be copied to the /usr/share/elasticsearch/config directory in the container) or use this docker-compose.yml sample:
version: '2'
services:
  elastic:
    image: elasticsearch:latest
    restart: always
    volumes:
      - [DIRECTORY]/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - [DIRECTORY]/data:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
      - 9300:9300

where [DIRECTORY] is a directory on the host that contains the elasticsearch.yml configuration file. Start the container using
sudo docker-compose up -d

When running Elasticsearch in a Docker container, the steps below have to be modified appropriately.
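After starting the container it can take a little while before Elasticsearch answers requests. The following is a minimal sketch of a retry helper, not part of the original setup; `wait_for` and its variable names are hypothetical, and the example call assumes the 9200 port mapping from the compose sample and that curl is installed on the host.

```shell
# wait_for ATTEMPTS DELAY COMMAND...
# Runs COMMAND until it succeeds, sleeping DELAY seconds between tries.
# Returns 0 on the first success, 1 if every attempt fails.
wait_for() {
  wf_attempts=$1
  wf_delay=$2
  shift 2
  wf_try=1
  while [ "$wf_try" -le "$wf_attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    wf_try=$((wf_try + 1))
    # Only sleep when another attempt is coming.
    if [ "$wf_try" -le "$wf_attempts" ]; then
      sleep "$wf_delay"
    fi
  done
  return 1
}

# Example (assumption: Elasticsearch is published on localhost:9200):
# wait_for 30 2 curl -fsS http://localhost:9200 && echo "Elasticsearch is up"
```

The helper takes the probe as an arbitrary command, so the same function can wrap a different check if the port or host differs in your setup.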
You will want to attach a secondary hard disk to your virtual machine before continuing. We use this disk to store the Elasticsearch indexes. We create the largest one possible in Azure, since you only pay for the space that is actually allocated on disk.
Get a list of the attached SCSI devices.
dmesg | grep SCSI

Make sure the new disk is sdc so that we format the correct one.
sudo fdisk /dev/sdc

Enter the command n, then p, accept all the defaults, then enter w to write the partition table.
sudo mkfs -t ext4 /dev/sdc1

Mount the new drive to /mnt/data.
sudo mkdir /mnt/data
sudo mount /dev/sdc1 /mnt/data

Auto mount the drive on reboot.
sudo -i blkid

Grab the UUID for /dev/sdc1 (shown as YOUR_GUID below) and open fstab.
sudo nano /etc/fstab

Paste in under the existing UUID:
UUID=YOUR_GUID /mnt/data ext4 defaults 0 0

Create the storage folders by creating a db and a log directory in /mnt/data.
cd /mnt/data
sudo mkdir db
sudo mkdir log

Make the elasticsearch user the owner of the folders.
sudo chown -R elasticsearch:elasticsearch /mnt/data/
sudo chown -R elasticsearch:elasticsearch /mnt/data/log
sudo chown -R elasticsearch:elasticsearch /mnt/data/db

- Install the mapper-size plugin.
cd /usr/share/elasticsearch
sudo bin/elasticsearch-plugin install mapper-size

- If on Azure, we should also install the Cloud Azure plugin.
cd /usr/share/elasticsearch
sudo bin/elasticsearch-plugin install repository-azure

It's important that you decide early on roughly how many nodes you will run and how much RAM each node will have so you can configure the cluster properly. It's recommended that you run at least three nodes, with two master nodes. Having lots of RAM and fast storage will help greatly.
Update the Elasticsearch configuration. We have our configuration file located here:
sudo nano /etc/elasticsearch/elasticsearch.yml

Edit the environment config and set ES_HEAP_SIZE to half of the RAM size:
sudo nano /etc/default/elasticsearch

Set MAX_LOCKED_MEMORY=unlimited:
sudo nano /etc/init.d/elasticsearch

Update system limits:
sudo nano /etc/security/limits.conf

With these values:
elasticsearch - nofile 65535
elasticsearch - memlock unlimited

Update the systemd configuration settings:
sudo nano /usr/lib/systemd/system/elasticsearch.service

With these values:
LimitMEMLOCK=infinity

Restart the service to ensure the configuration is picked up:
sudo /bin/systemctl restart elasticsearch
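The "half of the RAM" guideline for the heap size can be worked out from the machine's memory instead of by hand. This is only a sketch: `half_ram_mb` and `total_ram_kib` are hypothetical helper names, and reading `MemTotal` from `/proc/meminfo` assumes a Linux host such as the Ubuntu VM used here.

```shell
# half_ram_mb TOTAL_KIB: prints half of TOTAL_KIB expressed in whole MiB.
half_ram_mb() {
  echo $(( $1 / 1024 / 2 ))
}

# total_ram_kib: reads MemTotal (reported in KiB) from /proc/meminfo.
total_ram_kib() {
  awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Example: print a value with an "m" suffix to copy into the heap setting.
# echo "$(half_ram_mb "$(total_ram_kib)")m"
```

The calculation is kept separate from the `/proc/meminfo` lookup so it can be checked with a known input before relying on it.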
Finally, let's verify that mlockall is true and maxfiles is 65535.
curl "http://localhost:9200/_nodes/process?pretty"

Ensure Elasticsearch starts after reboot via systemd:
sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable elasticsearch.service

This section assumes that you've configured the Cloud-Azure plugin in the previous configuration step with your Azure blob storage access keys.
We'll create a new snapshot repository. You'll need to follow this step as well if you wish to restore production data to a secondary cluster.
PUT _snapshot/ex_organizations
{
  "type": "azure",
  "settings": {
    "base_path": "organizations"
  }
}

PUT _snapshot/ex_stacks
{
  "type": "azure",
  "settings": {
    "base_path": "stacks"
  }
}

PUT _snapshot/ex_events
{
  "type": "azure",
  "settings": {
    "base_path": "events"
  }
}
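The three requests above differ only in the repository name and base path, so they can be generated in a loop. The sketch below prints curl commands rather than running them; `repo_body` and `register_repos` are hypothetical helpers, and the host URL in the example is an assumption about where your node is listening.

```shell
# repo_body BASE_PATH: prints the JSON body for an Azure snapshot repository.
repo_body() {
  printf '{"type":"azure","settings":{"base_path":"%s"}}\n' "$1"
}

# register_repos HOST: prints (does not run) one curl command per repository.
register_repos() {
  for name in organizations stacks events; do
    printf "curl -XPUT '%s/_snapshot/ex_%s' -H 'Content-Type: application/json' -d '%s'\n" \
      "$1" "$name" "$(repo_body "$name")"
  done
}

# Review the generated commands first, then pipe them to sh to execute:
# register_repos http://localhost:9200
# register_repos http://localhost:9200 | sh
```

Printing the commands first gives you a chance to check the repository names before anything is sent to the cluster.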
To create a backup and view the status of a snapshot:
GET _snapshot/ex_stacks/_status
PUT /_snapshot/ex_stacks/2020-01-01-12-00
{
  "indices": "stacks*",
  "ignore_unavailable": "true"
}

GET _snapshot/ex_events/_status
PUT /_snapshot/ex_events/2020-01-01-12-00
{
  "indices": "events*",
  "ignore_unavailable": "true"
}

GET _snapshot/ex_organizations/_status
PUT /_snapshot/ex_organizations/2020-01-01-12-00
{
  "indices": "organizations*",
  "ignore_unavailable": "true"
}
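The snapshot names in these examples follow a year-month-day-hour-minute pattern. If you create snapshots by hand, a name in the same shape can be generated from the current time. This is a sketch under assumptions: `snapshot_name` and `snapshot_path` are hypothetical helpers, and using UTC is a choice made here, not something the examples above specify.

```shell
# snapshot_name: prints the current UTC time as YYYY-MM-DD-HH-MM.
snapshot_name() {
  date -u +%Y-%m-%d-%H-%M
}

# snapshot_path REPO [NAME]: prints the request path for a new snapshot.
# NAME defaults to the current timestamp from snapshot_name.
snapshot_path() {
  printf '/_snapshot/%s/%s\n' "$1" "${2:-$(snapshot_name)}"
}

# Example: snapshot_path ex_stacks
```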
Once the snapshot repositories are registered, the management of the snapshots will be handled by the out-of-process Exceptionless jobs.
Before restoring to a new cluster, you'll first want to set up the snapshot repositories as well as install and configure the Cloud-Azure plugin.
List of all snapshots:
GET _snapshot/ex_stacks/_all
GET _snapshot/ex_events/_all
GET _snapshot/ex_organizations/_all
To do a restore of all indices run the following command (please take a look at the Elasticsearch documentation on how to restore a single index):
POST _snapshot/ex_organizations/2015-12-01-12-30/_restore
{
  "include_global_state": false
}
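The example above restores only the organizations repository. A restore of the other two repositories would use the same request shape. The sketch below prints the request path for each repository; `restore_paths` is a hypothetical helper, and it assumes a snapshot with the same name exists in all three repositories, which you should confirm with the snapshot listing first.

```shell
# restore_paths SNAPSHOT: prints one _restore request path per repository.
restore_paths() {
  for repo in ex_organizations ex_stacks ex_events; do
    printf '/_snapshot/%s/%s/_restore\n' "$repo" "$1"
  done
}

# Example: restore_paths 2015-12-01-12-30
```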