Add a step to exhaustive tests for observabilitySRE acceptance testing #17623
x-pack/build.gradle
description = "Run ObservabilitySRE acceptance tests"
// Need to have set up the ruby environment for rspec even through we are running in container
dependsOn(":bootstrap", ":logstash-core:assemble", ":installDevelopmentGems")
// TODO: hook in to rspec
I've proved out that we can run these tasks on a FIPS host. This gives us a consistent Java/JRuby environment in which we can follow the standard gradle/rake/rspec control flow.
For the next step in this PR I will add some rspec that shows a pattern for doing container orchestration. At this point I'm thinking that will look like shelling out to docker-compose from rspec.
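One way to picture the "shell out from rspec" idea is a wrapper that always tears the stack down, even when the test body fails. This is only a sketch under assumptions: `with_compose_stack`, `up:` and `down:` are hypothetical names, and the callables stand in for whatever actually starts and stops docker-compose.

```ruby
# Hypothetical wrapper (not from the PR): `up` and `down` are callables that
# stand in for whatever starts and stops the docker-compose stack.
def with_compose_stack(up:, down:)
  up.call
  yield
ensure
  # Runs even if `up` or the block raises, so containers are not left behind.
  down.call
end
```

In rspec the same guarantee usually comes from pairing `before(:all)` with `after(:all)`; the wrapper just makes the ordering explicit.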
x-pack/distributions/internal/observabilitySRE/qa/acceptance/docker/docker-compose.yml
...s/internal/observabilitySRE/qa/acceptance/docker/elasticsearch/config/elasticsearch-fips.yml
Still very much WIP. I have some cleanup to do and bugs to track down. I just wanted to get the structure out there, breaking down responsibility between gradle/rspec etc., and float the idea of changing the configuration of components via interpolation in the docker compose file.
94869d9 to 6c53570
I've got this all working locally. Specifically, the rspec tests now verify that data goes from LS to ES, with gradle generating certs and rspec managing container startup/teardown. We are still waiting on unblocking generation of a FIPS-enabled test runner, but I'm happy with the patterns established here for breaking down the responsibility between gradle/rspec and docker-compose.
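For illustration, checking that data reached ES could be done against the Elasticsearch `_count` API, which returns a JSON body with a `count` field. The sketch below is an assumption about how such a check might look, not the code in this PR: `es_doc_count`, the default base URL, and the injectable `fetch:` are hypothetical, and a real FIPS stack would also need the generated CA certificate and credentials on the request.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper (not from the PR): returns the number of documents in an
# index via the Elasticsearch `_count` API. `fetch` is injectable so the parsing
# can be exercised without a running cluster; the default performs a plain GET
# without the TLS trust or authentication a secured stack would require.
def es_doc_count(index, base_url: "https://localhost:9200", fetch: ->(uri) { Net::HTTP.get(uri) })
  uri = URI.join(base_url, "/#{index}/_count")
  JSON.parse(fetch.call(uri)).fetch("count")
end
```

A test for the weak-TLS case would then expect the count to stay at zero (or the index to be absent) after the pipeline has had time to run.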
The structure looks sensible, and leaves enough hook-points for future variations.
I've left a note about encapsulating the complexity of controlling docker from the hooks. Feel free to resolve that as you see fit (even if that is just acknowledging my nitpick without addressing it).
context "when running with non-FIPS compliant configuration" do
before(:all) do
system("cd #{__dir__}/../docker && LOGSTASH_PIPELINE=logstash-to-elasticsearch-weak.conf docker-compose up -d") or fail "Failed to start Docker Compose with weak SSL"
nit: since each test context grouping needs to do some variation of this, we can define helper methods to encapsulate the complexity. I also prefer to use the long version of flags in checked-in source (e.g., `--detach` and `--volumes` instead of `-d` and `-v`) since it makes it easier to understand the intention.
require 'shellwords'
require 'pathname'

def docker_compose_up(env={}) = docker_compose_invoke("up --detach", env)
def docker_compose_down(env={}) = docker_compose_invoke("down --volumes", env)
def docker_compose_invoke(subcommand, env={})
  # Render env as a shell-escaped `KEY=value ` prefix for the command
  env_str = env.map{ |k,v| "#{k.to_s.upcase}=#{Shellwords.escape(v)} "}.join
  command = "#{env_str}docker-compose #{subcommand}"
  work_dir = Pathname.new("#{__dir__}/../docker").cleanpath
  system("cd #{Shellwords.escape(work_dir.to_s)} && #{command}") or fail "Failed to invoke Docker Compose with command `#{command}` in directory `#{work_dir}`"
end
And I think we can use `docker-compose`'s `--project-directory` to set the working directory and avoid the `&&`-chaining:
def docker_compose_invoke(subcommand, env={})
  env_str = env.map{ |k,v| "#{k.to_s.upcase}=#{Shellwords.escape(v)} "}.join
  work_dir = Pathname.new("#{__dir__}/../docker").cleanpath
  command = "#{env_str}docker-compose --project-directory=#{Shellwords.escape(work_dir.to_s)} #{subcommand}"
  system(command) or fail "Failed to invoke Docker Compose with command `#{command}`"
end
But either would make this line look like:
-system("cd #{__dir__}/../docker && LOGSTASH_PIPELINE=logstash-to-elasticsearch-weak.conf docker-compose up -d") or fail "Failed to start Docker Compose with weak SSL"
+docker_compose_up(logstash_pipeline: 'logstash-to-elasticsearch-weak.conf')
Great suggestion. Incorporated.
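As a possible variation on the helpers discussed in this thread (nothing here indicates it is what was merged), `Kernel#system` accepts an environment hash as its first argument and a `chdir:` option, which avoids both manual shell escaping and the `cd ... &&` chaining. A minimal sketch, where `compose_invocation`, the `env:` keyword, and the injectable `runner:` are hypothetical names added for illustration:

```ruby
# Hypothetical helper (not from the PR): builds the arguments for Kernel#system
# so the command runs without a shell, with env vars passed as a hash.
def compose_invocation(subcommand, env: {}, work_dir: File.expand_path("../docker", __dir__))
  env_hash = env.to_h { |k, v| [k.to_s.upcase, v.to_s] }
  argv = ["docker-compose", *subcommand.split]
  [env_hash, argv, { chdir: work_dir }]
end

# `runner` defaults to Kernel.system; it is injectable only so the sketch can be
# exercised without docker installed.
def docker_compose_invoke(subcommand, env: {}, runner: Kernel.method(:system))
  env_hash, argv, opts = compose_invocation(subcommand, env: env)
  runner.call(env_hash, *argv, **opts) or fail "Failed to invoke Docker Compose: #{argv.join(' ')}"
end
```

With this shape a call would read `docker_compose_invoke("up --detach", env: { logstash_pipeline: "logstash-to-elasticsearch-weak.conf" })`. The trade-off is that the failure message no longer shows a copy-pasteable shell command.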
# Generate this message indefinitely to give ES container time to come online
count => -1
In theory we shouldn't need to generate more than one, since the ES output is designed to retry its batch of events until each of them has been either accepted or explicitly rejected by Elasticsearch (e.g., with a successful HTTP 2XX response from the bulk API containing individual rejections).
Ah, yeah, I had a wrong assumption earlier. You are right here, and this will actually save quite a bit of headache. Thanks!
This commit shows the proposed pattern for adding acceptance testing for the observability SRE image. This will run when exhaustive tests run. A new gradle task will hook in to rspec similar to how it is done for the smoke tests. The main difference is that instead of building a container, the latest is pulled from the container registry and run on a fips configured host VM.
…tests
This commit shows the rough structure for how I am planning on handling docker compose networks for acceptance tests. The main idea is to use interpolation in the docker compose file to point to different configuration files for filebeat/logstash/elasticsearch. This is mainly due to the nature of these tests showing behavior when the system is and is not configured properly for FIPS. The breakdown in responsibility is:
1. Gradle handles cert generation (similar to smoke test, this avoids checking in PKI)
2. Rspec handles stopping/starting docker compose and managing environment vars for interpolation in docker compose manifests (different from smoke tests where a single static docker compose is started in gradle)
3. Rspec handles deciding when containers are ready and querying state about data flowing through the system
4. Gradle cleans up certs
This is just a rough sketch; there are still bugs to be worked out, but before I get too far in to it I want to get the idea out there.
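The third responsibility in the breakdown (rspec deciding when containers are ready) typically comes down to polling with a deadline. The following is a minimal sketch under assumptions: `wait_until`, its defaults, and the injectable `clock:`/`sleeper:` are hypothetical and not taken from the PR.

```ruby
# Hypothetical polling helper (not from the PR): calls the block until it
# returns a truthy value or the timeout elapses. Clock and sleeper are
# injectable so the behaviour can be checked without real waiting.
def wait_until(timeout: 60, interval: 1,
               clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
               sleeper: ->(seconds) { sleep(seconds) })
  deadline = clock.call + timeout
  loop do
    result = yield
    return result if result
    raise "condition not met within #{timeout}s" if clock.call >= deadline
    sleeper.call(interval)
  end
end
```

A readiness check could then wrap whatever probe the test uses, for example `wait_until(timeout: 120) { elasticsearch_ready? }`, where `elasticsearch_ready?` is again a placeholder.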
This commit adds a test to show that data will not flow from LS to ES when weak non fips config is used.
This will be handled separately in a separate PR, but taking this commit for now on this branch.
The latest ES images do not require this workaround.
1. Remove rogue character from test file causing interpreter failure
2. Split out helpers for docker compose orchestration
3. Only send a single message instead of infinite through to ES
a257cad to a0b1f8e
As described in elastic/ingest-dev#5471 this commit adds a test for filebeat sending data through logstash to elasticsearch using fips config.
This test ensures logstash will not accept data from filebeat when using weak tls configuration. See elastic/ingest-dev#5472
d8b1980 to d39a080
Kicked off https://buildkite.com/elastic/logstash-exhaustive-tests-pipeline/builds/1935
The ci-agent-images PR has been approved and will be merged Monday: https://github.com/elastic/ci-agent-images/pull/1426#issuecomment-2923339566
Crytpo is actually kind of a funny.
@@ -42,7 +42,7 @@ def before_bootstrap_checks(runner)
 # ensure Bouncycastle is configured and ready
 begin
 if Java::org.bouncycastle.crypto.CryptoServicesRegistrar.isInApprovedOnlyMode
-accumulator.success "Bouncycastle Crytpo is in `approved-only` mode"
+accumulator.success "Bouncycastle Crypto is in `approved-only` mode"
Credit @robbavey for eagle eye 🦅
Somehow I lost d8b1980. I just re-introduced it and kicked off a fresh test: https://buildkite.com/elastic/logstash-exhaustive-tests-pipeline/builds/1938
Green!
https://buildkite.com/elastic/logstash-exhaustive-tests-pipeline/builds/1938
x-pack/distributions/internal/observabilitySRE/qa/acceptance/docker/docker-compose.yml
fb096e0 to 721e13b
Kicked off a build: https://buildkite.com/elastic/logstash-exhaustive-tests-pipeline/builds/1947
Locally this is good.
Use the same buildkite agent script for setting up a vm based runner as other pipes
Ugh, I had lost a1504c4 again. Re-kicked the buildkite validation: https://buildkite.com/elastic/logstash-exhaustive-tests-pipeline/builds/1948
💚 Build Succeeded
One question about the `elasticsearch.yml` config; feel free to merge if that setting is not required.
discovery.type: single-node
http.port: 9200
network.host: 0.0.0.0
# Security settings
Does `xpack.security.fips_mode.enabled` need to be set to `true`, or is this not required with the `elasticsearch-cloud-ess-fips` docker image?
I will add this as an investigation item in https://github.com/elastic/ingest-dev/issues/5320 (added a point in the google doc to track this down).
Buildkite was green, and I added the last question to another ticket: #17623 (comment). We will see how this does over the weekend; I expect it to stay green.
elastic#17623)
* Add a step to exhaustive tests for observabilitySRE acceptance testing
This commit shows the proposed pattern for adding acceptance testing for the observability SRE image. This will run when exhaustive tests run. A new gradle task will hook in to rspec similar to how it is done for the smoke tests. The main difference is that instead of building a container, the latest is pulled from the container registry and run on a fips configured host VM.
* WIP: Idea for how to handle multiple container configs for acceptance tests
This commit shows the rough structure for how I am planning on handling docker compose networks for acceptance tests. The main idea is to use interpolation in the docker compose file to point to different configuration files for filebeat/logstash/elasticsearch. This is mainly due to the nature of these tests showing behavior when the system is and is not configured properly for FIPS. The breakdown in responsibility is:
1. Gradle handles cert generation (similar to smoke test, this avoids checking in PKI)
2. Rspec handles stopping/starting docker compose and managing environment vars for interpolation in docker compose manifests (different from smoke tests where a single static docker compose is started in gradle)
3. Rspec handles deciding when containers are ready and querying state about data flowing through the system
4. Gradle cleans up certs
This is just a rough sketch; there are still bugs to be worked out, but before I get too far in to it I want to get the idea out there.
* Add tests describing behavior of LS -> ES with non-fips config
This commit adds a test to show that data will not flow from LS to ES when weak non fips config is used.
* Use latest ES image
This will be handled separately in a separate PR, but taking this commit for now on this branch.
* Remove custom entrypoint from new container
The latest ES images do not require this workaround.
* Take up code review suggestions
1. Remove rogue character from test file causing interpreter failure
2. Split out helpers for docker compose orchestration
3. Only send a single message instead of infinite through to ES
* Add full prefix name for new image
* Test filebeat -> LS -> ES using fips config
As described in elastic/ingest-dev#5471 this commit adds a test for filebeat sending data through logstash to elasticsearch using fips config.
* Test LS wont accept input from non fips configured filebeat
This test ensures logstash will not accept data from filebeat when using weak tls configuration. See elastic/ingest-dev#5472
* Fix a funny typo.
Crytpo is actually kind of a funny.
* Ensure we are using the purpose build ES image in testing
Similar to elastic#17627
* Ensure JAVA_HOME is set etc
Use the same buildkite agent script for setting up a vm based runner as other pipes
…in (#17785)
* forward-port observabilitySRE image creation into `main`
This is the CLEAN subset of a cherry-pick of the merge-commit from the observabilitySRE feature branch into 8.x in PR #17541 (0b1d299), OMITTING changes to `docker/*` and `rakelib/artifacts.rake` that would conflict due to substantial refactorings on `main`.
* forward-port observabilitySRE image creation into `main` (re-implement)
This is a forward-port of _functionality_ from the observabilitySRE feature branch into 8.x in PR #17541 (0b1d299), wholly re-implementing the changes in `docker/*` and `rakelib/artifacts.rake` from the 8.x-style docker structure to the refactored structure present on `main`.
* Fix pull request pipeline definition for buildkite (#17552)
When the fedramp high feature branch was merged into 8.x the PR pipeline accidentally duplicated the top level `steps` key. This was a mistake and is causing issues generating exhaustive test pipeline definition. This commit fixes the bug by ensuring there is a single `steps` key that defines all the steps in the pipeline.
* Ensure observabilitySRE image is pushed on DRA staging (#17569)
The `artifactDockerObservabilitySRE` gradle task *always* produces a tag with a `SNAPSHOT` postfix. In the staging pipeline we use the shared `qualified-version` script for determining the LS version. That script correctly handles conditionally adding a `SNAPSHOT` postfix which is important for the tagging scheme for pushing to our container registry. Given the intermediate tag produced by the gradle task is never pushed anywhere we can update the build script to ensure the "local" artifact is always referenced with the `SNAPSHOT` postfix.
* Use dedicated elasticsearch image for observabilitySRE smoke testing (#17627)
* Use dedicated elasticsearch image for observabilitySRE smoke testing
The ES team has started publishing a purpose built image for the fedramp high project. Update our smoke test stack to use this container.
* Override default entrypoint into elasticsearch container
The new image does not provide the stub `/app/elasticsearch.sh` file https://github.com/elastic/elasticsearch/blob/1a1763c591c4c32bf66f0df3bce2040e8f19a1a2/distribution/docker/README.md?plain=1#L16-L19 previously available. This commit overrides the entrypoint to avoid needing that file. See: https://github.com/elastic/elasticsearch/blob/1a1763c591c4c32bf66f0df3bce2040e8f19a1a2/distribution/docker/src/docker/Dockerfile.ess#L38C5-L40C37
* Remove entrypoint workaround due to fix landing upstream
* Restore code review changes (#17539)
* Comment to clarify why FIPS flag is not needed for smoke tests
* Use full versions of docker commands for readability
* Simplify grok pattern match
The grok pattern is unanchored-by-default, we don't need the leading and trailing wildcards.
* Add a step to exhaustive tests for observabilitySRE acceptance testing (#17623)
* Add a step to exhaustive tests for observabilitySRE acceptance testing
This commit shows the proposed pattern for adding acceptance testing for the observability SRE image. This will run when exhaustive tests run. A new gradle task will hook in to rspec similar to how it is done for the smoke tests. The main difference is that instead of building a container, the latest is pulled from the container registry and run on a fips configured host VM.
* WIP: Idea for how to handle multiple container configs for acceptance tests
This commit shows the rough structure for how I am planning on handling docker compose networks for acceptance tests. The main idea is to use interpolation in the docker compose file to point to different configuration files for filebeat/logstash/elasticsearch. This is mainly due to the nature of these tests showing behavior when the system is and is not configured properly for FIPS. The breakdown in responsibility is:
1. Gradle handles cert generation (similar to smoke test, this avoids checking in PKI)
2. Rspec handles stopping/starting docker compose and managing environment vars for interpolation in docker compose manifests (different from smoke tests where a single static docker compose is started in gradle)
3. Rspec handles deciding when containers are ready and querying state about data flowing through the system
4. Gradle cleans up certs
This is just a rough sketch; there are still bugs to be worked out, but before I get too far in to it I want to get the idea out there.
* Add tests describing behavior of LS -> ES with non-fips config
This commit adds a test to show that data will not flow from LS to ES when weak non fips config is used.
* Use latest ES image
This will be handled separately in a separate PR, but taking this commit for now on this branch.
* Remove custom entrypoint from new container
The latest ES images do not require this workaround.
* Take up code review suggestions
1. Remove rogue character from test file causing interpreter failure
2. Split out helpers for docker compose orchestration
3. Only send a single message instead of infinite through to ES
* Add full prefix name for new image
* Test filebeat -> LS -> ES using fips config
As described in elastic/ingest-dev#5471 this commit adds a test for filebeat sending data through logstash to elasticsearch using fips config.
* Test LS wont accept input from non fips configured filebeat
This test ensures logstash will not accept data from filebeat when using weak tls configuration. See elastic/ingest-dev#5472
* Fix a funny typo.
Crytpo is actually kind of a funny.
* Ensure we are using the purpose build ES image in testing
Similar to #17627
* Ensure JAVA_HOME is set etc
Use the same buildkite agent script for setting up a vm based runner as other pipes
---------
Co-authored-by: Cas Donoghue <[email protected]>
Release notes
[rn:skip]
What does this PR do?
This commit shows the proposed pattern for adding acceptance testing for the observability SRE image. This will run when exhaustive tests run. A new gradle task will hook in to rspec similar to how it is done for the smoke tests. The main difference is that instead of building a container, the latest is pulled from the container registry and run on a FIPS-configured host VM. Tests have been added showing data flowing under FIPS mode from filebeat through logstash to elasticsearch. Test coverage has also been added for what happens when logstash is configured to send or receive data over non-FIPS TLS. We show that an error is logged and no data is sent or received.
Why is it important/What is the impact to the user?
NA
Checklist
Related issues