Atlas中的所有配置都使用java属性样式配置。主配置文件是atlas-application.properties
,它位于部署路径的conf
目录中。它由以下部分组成:
设置以下属性以将JanusGraph配置为使用HBase作为持久性引擎。有关详细信息,请参阅这里。
atlas.graph.storage.backend=hbase
atlas.graph.storage.hostname=<ZooKeeper Quorum>
atlas.graph.storage.hbase.table=atlas
如果需要任何进一步的JanusGraph配置,请在属性名称前加上“atlas.graph。”。
除了设置配置外,请确保将环境变量HBASE_CONF_DIR
设置为指向包含HBase配置文件hbase-site.xml
的目录。
ATLAS需要索引搜索引擎。此搜索引擎与ATLAS服务器和存储后端分开运行。目前仅支持两个搜索引擎:Solr
和Elasticsearch
。选择最适合您环境的搜索引擎,并按照下面的配置说明操作。
在云模式下安装Solr是Apache Atlas使用的先决条件。设置以下属性以配置JanusGraph以使用Solr作为索引搜索引擎。
atlas.graph.index.search.backend=solr5
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.wait-searcher=true
# ZK quorum setup for solr as comma separated value. Example: 10.1.6.4:2181,10.1.6.5:2181
atlas.graph.index.search.solr.zookeeper-url=
# SolrCloud Zookeeper Connection Timeout. Default value is 60000 ms
atlas.graph.index.search.solr.zookeeper-connect-timeout=60000
# SolrCloud Zookeeper Session Timeout. Default value is 60000 ms
atlas.graph.index.search.solr.zookeeper-session-timeout=60000
Elasticsearch是Apache Atlas使用的先决条件。设置以下属性以配置JanusGraph以使用Elasticsearch作为索引搜索引擎。
atlas.graph.index.search.backend=elasticsearch
atlas.graph.index.search.hostname=<hostname(s) of the Elasticsearch master nodes comma separated>
atlas.graph.index.search.elasticsearch.client-only=true
搜索API(DSL,基本搜索,全文搜索)支持分页,并具有可选的limit和offset参数。以下配置与搜索分页有关:
# Default limit used when limit is not specified in API
atlas.search.defaultlimit=100
# Maximum limit allowed in API. Limits maximum results that can be fetched to make sure the atlas server doesn't run out of memory
atlas.search.maxlimit=10000
有关Kafka配置,请参阅http://kafka.apache.org/documentation.html#configuration。所有Kafka配置都应该以'atlas.kafka'为前缀。
atlas.kafka.auto.commit.enable=false
# Kafka servers. Example: localhost:6667
atlas.kafka.bootstrap.servers=
atlas.kafka.hook.group.id=atlas
# Zookeeper connect URL for Kafka. Example: localhost:2181
atlas.kafka.zookeeper.connect=
atlas.kafka.zookeeper.connection.timeout.ms=30000
atlas.kafka.zookeeper.session.timeout.ms=60000
atlas.kafka.zookeeper.sync.time.ms=20
# Setup the following configurations only in test deployments where Kafka is started within Atlas in embedded mode
# atlas.notification.embedded=true
# atlas.kafka.data=${sys:atlas.home}/data/kafka
# Setup the following two properties if Kafka is running in Kerberized mode.
# atlas.notification.kafka.service.principal=kafka/[email protected]
# atlas.notification.kafka.keytab.location=/etc/security/keytabs/kafka.service.keytab
atlas.client.readTimeoutMSecs=60000
atlas.client.connectTimeoutMSecs=60000
# URL to access Atlas server. For example: http://localhost:21000
atlas.rest.address=
以下属性用于打开SSL功能。
atlas.enableTLS=false
以下属性描述了与高可用性相关的配置选项:
# Set the following property to true, to enable High Availability. Default = false.
atlas.server.ha.enabled=true
# Specify the list of Atlas instances
atlas.server.ids=id1,id2
# For each instance defined above, define the host and port on which Atlas server listens.
atlas.server.address.id1=host1.company.com:21000
atlas.server.address.id2=host2.company.com:31000
# Specify Zookeeper properties needed for HA.
# Specify the list of services running Zookeeper servers as a comma separated list.
atlas.server.ha.zookeeper.connect=zk1.company.com:2181,zk2.company.com:2181,zk3.company.com:2181
# Specify how many times should connection try to be established with a Zookeeper cluster, in case of any connection issues.
atlas.server.ha.zookeeper.num.retries=3
# Specify how much time should the server wait before attempting connections to Zookeeper, in case of any connection issues.
atlas.server.ha.zookeeper.retry.sleeptime.ms=1000
# Specify how long a session to Zookeeper should last without inactiviy to be deemed as unreachable.
atlas.server.ha.zookeeper.session.timeout.ms=20000
# Specify the scheme and the identity to be used for setting up ACLs on nodes created in Zookeeper for HA.
# The format of these options is <scheme>:<identity>. For more information refer to http://zookeeper.apache.org/doc/r3.2.2/zookeeperProgrammers.html#sc_ZooKeeperAccessControl.
# The 'acl' option allows to specify a scheme, identity pair to setup an ACL for.
atlas.server.ha.zookeeper.acl=sasl:[email protected]
# The 'auth' option specifies the authentication that should be used for connecting to Zookeeper.
atlas.server.ha.zookeeper.auth=sasl:[email protected]
# Since Zookeeper is a shared service that is typically used by many components,
# it is preferable for each component to set its znodes under a namespace.
# Specify the namespace under which the znodes should be written. Default = /apache_atlas
atlas.server.ha.zookeeper.zkroot=/apache_atlas
# Specify number of times a client should retry with an instance before selecting another active instance, or failing an operation.
atlas.client.ha.retries=4
# Specify interval between retries for a client.
atlas.client.ha.sleep.interval.ms=5000
# Set the following property to true, to enable the setup steps to run on each server start. Default = false.
atlas.server.run.setup.on.start=false
在特定情况下,可以使用以下属性来调整Atlas的性能:
# The number of times Atlas code tries to acquire a lock (to ensure consistency) while committing a transaction.
# This should be related to the amount of concurrency expected to be supported by the server. For e.g. with retries set to 10, upto 100 threads can concurrently create types in the Atlas system.
# If this is set to a low value (default is 3), concurrent operations might fail with a PermanentLockingException.
atlas.graph.storage.lock.retries=10
# Milliseconds to wait before evicting a cached entry. This should be > atlas.graph.storage.lock.wait-time x atlas.graph.storage.lock.retries
# If this is set to a low value (default is 10000), warnings on transactions taking too long will occur in the Atlas application log.
atlas.graph.storage.cache.db-cache-time=120000
# Minimum number of threads in the atlas web server
atlas.webserver.minthreads=10
# Maximum number of threads in the atlas web server
atlas.webserver.maxthreads=100
# Keepalive time in secs for the thread pool of the atlas web server
atlas.webserver.keepalivetimesecs=60
# Queue size for the requests(when max threads are busy) for the atlas web server
atlas.webserver.queuesize=100
要为各种Atlas操作(如REST API调用,通知处理)启用性能日志,请在atlas-log4j.xml
中设置以下内容:
<appender name="perf_appender" class="org.apache.log4j.DailyRollingFileAppender">
<param name="File" value="/var/log/atlas/atlas_perf.log"/>
<param name="datePattern" value="'.'yyyy-MM-dd"/>
<param name="append" value="true"/>
<layout class="org.apache.log4j.PatternLayout">
<param name="ConversionPattern" value="%d|%t|%m%n"/>
</layout>
</appender>
<logger name="org.apache.atlas.perf" additivity="false">
<level value="debug"/>
<appender-ref ref="perf_appender"/>
</logger>