Skip to content

Commit 7adb008

Browse files
authored
Adding support for Athena (#27)
* Added Athena Task and tests * Edited typos in README.md * Updated the README to include Athena Task * Updated LambdaCron policy to include athena:StartQueryExecution * Updated schema.json to include athena * Updated test_validate.py to include test for athena schema * Added athena passing test schema * Updated schema.json to remove unrequired parameters * Updated README.md to reflect change in schema * Updated athena_task.yml to remove unrequired parameters * Updated test_task_runner.py to remove unrequired parameters from the athena task * Updated template.cfn.yml to add S3 permissions necessary for Athena * Updated schema to include optional AWS API parameters * Updated template.cfn.yml to remove extraneous IAM policy statements * Updated Diagram PNG and XML
1 parent 96034ec commit 7adb008

File tree

9 files changed

+135
-14
lines changed

9 files changed

+135
-14
lines changed

README.md

+27-5
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@ LambdaCron offers 4 different types of tasks:
2727
* **Batch task**: submit AWS Batch job.
2828
* **HTTP task**: send HTTP requests (GET & POST).
2929

30-
Currently LambdaCron intergrates with HTTP requests and 3 AWS services.
30+
Currently LambdaCron integrates with HTTP requests and 3 AWS services.
3131
It is ready be extended for other services and, in general, it is
3232
ready to reach any service available by an API.
3333

@@ -251,7 +251,7 @@ from a file or a set of tasks in a directory.
251251
Parameters:
252252
253253
* **--task-file (-t)**: File that contains a task definition.
254-
* **--task-directory (-d)**: Directory with a set of files with taqsks definitions.
254+
* **--task-directory (-d)**: Directory with a set of files with tasks definitions.
255255
256256
257257
## Tasks
@@ -330,7 +330,7 @@ The task definition must contain the following keys:
330330
name: 'Enrich new stats every hour'
331331
expression: '0 * * * *'
332332
task:
333-
type: 'bath'
333+
type: 'batch'
334334
jobName: 'enrich-stats'
335335
jobDefinition: 'enrich-stats-definition:1'
336336
jobQueue: 'jobs_high_priority'
@@ -350,20 +350,42 @@ The task definition must contain the following keys:
350350
* **request**: YAML with parameters to send for the selected method using [Requests](http://docs.python-requests.org/en/master/)
351351
352352
``` yaml
353-
name: 'helth check every hour'
353+
name: 'health check every hour'
354354
expression: '0 * * * *'
355355
task:
356356
type: 'http'
357357
method: 'get'
358358
request:
359-
url: 'http://helthcheck.my-domain.com'
359+
url: 'http://healthcheck.my-domain.com'
360360
params:
361361
service: 'lambda'
362362
```
363363
364364
It is a wrapper over [Requests](http://docs.python-requests.org/en/master/).
365365
All HTTP methods will be supported soon.
366366
367+
### Athena task
368+
369+
It executes the SQL query.
370+
The task definition must contain the following keys:
371+
372+
* **type**: *athena*
373+
* **QueryString**: The SQL query statements to be executed (string)
374+
* **ResultConfiguration**: (map)
375+
* **OutputLocation**: the location in S3 where query results are stored (string)
376+
377+
``` yaml
378+
name: 'get high scores every fifteen minutes'
379+
expression: '0 15 * * *'
380+
task:
381+
type: 'athena'
382+
QueryString: 'SELECT Username, HighScore FROM Database.UserTable WHERE HighScore > 1000'
383+
ResultConfiguration:
384+
OutputLocation: 'http://scores.my-app.s3.amazonaws.com'
385+
```
386+
387+
It is a wrapper for [boto3 Athena.Client.start_query_execution](https://boto3.readthedocs.io/en/latest/reference/services/athena.html#Athena.Client.start_query_execution). All parameters for the method can be set in the task definition.
388+
367389
## Frequency
368390
369391
#### Execution time

lambda-cron-diagram.png

2.24 KB
Loading

lambda-cron-diagram.xml

+1-2
Original file line numberDiff line numberDiff line change
@@ -1,2 +1 @@
1-
<?xml version="1.0" encoding="UTF-8"?>
2-
<mxfile userAgent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36" version="6.2.4" editor="www.draw.io" type="device"><diagram name="Page-1">7VrLcqM6EP0aL02BeHqZOJOZxdxbqcpi5q4oGWRbFUCMkGM7X38lQLZlwJZjcKZmkiwMjV59TqulbmlkT9PNVwrz5T8kRskImPFmZD+MALCsicl/hGRbSTynFiwojutCe8EzfkO1UBZb4RgVSkFGSMJwrgojkmUoYooMUkrWarE5SdRec7hADcFzBJOm9AeO2bKSBq65l39DeLGUPVtm/WUGo5cFJaus7m8E7Hn5V31OoWyrLl8sYUzWByL7y8ieUkJY9ZRupigR2ErYqnqPHV9346YoYzoVQFXhFSYrJEdcjottJRYxLJZIFDdH9v2SpQl/tPgjH3ouiqSbhTACA64LYBSMUA5OCLM45OQwPoyQGwd+RXRrFHY4W0UviInajJIXNCUJobyNjGS8qfvm6GuFeHWGNgeiWpuviKSI0S0vIr8GflWltjxbcrbe8+g6lWh5QKHr1dZTW85i1/IePf5QA9gOpt0zmBFJ8xWrwMwQWxP6grOFkcB0FsNBMQwUDIHZguGkiaEPrsfQ+UMwdExbwdCSBnaIod/E0HOux9DrGUOY52GB6CuOUGEUv4rw1wqJlm8G3sTRws7uYQ77PWOHuQ+k3PBCMg/ZkpteYSwZy8OcEkYiktwQRVs2cQgjaIFxcj2MwXkYURbfiaWav0UJLAocqXB2IoBiZfFu6n+on9ni6msZRQlkfGVSdw8tStc9PBFO5h5eYE5UeOUeRzZRkBWNUF3rcA0+0xAIjhpikC4QazRUcrBTW4uWyUfSItW1OtTVpO0Chqy+GOoacv8MyR3wx1IEbkYR6IuiriEPQJHGpvkcRZwZuv1ZryHly3/ixXDl6xOimI8JUVlmg9lPWZk/HxTnb8elOy2gglPZrlbAKLuv38TD7sLImlPnmFNt43COGrKHMw6NIOBGxgGsnq3Du9CLeCrqNhjMi7hqT+CYX21D6RryAIaiEencyFD8Sc+G4l9oKP4R6vZghnLc0+S9htI15P4NxdUwFBEU4Agm3+EMJU+kwAyTjH+aEcZIqhqNLHuX4IUow0jeCEJGPJYq//iXOU6SA/l9+d8W7rytKDKKiMdGq0SYUC+pCFdd1i2/GcPIsO/QNBzQbRraoaDGVkzk+nJ9PXcZSziTLZin9ZfKSc/mBQ39vZapEdjX6++9X/8ddFcDMG6mTew2V9CDvq5G0NoyjzrnXCI+3O8SwgdzqE4Jn56LMvEMLpmdc1f8V7OzSiwLP65kLI7nLSaFb2BOS2HEJFqlJYktc9fr4rSTO7dBnTQCJd1wwotrU6cR2Eq9cVom/8+7xD4ILju7K/LqkELAD+XLHG8EJ/f1eB5EWohDcCeUBo9RnFklK3OcxYiKPCaXxpBB/lOyxX+XcF0/2QIr3zRNH4zF17EdjLlpoPE2TcYWCIw8W1xF6W5npTUd++DUs85zOrj7cdTVp7kvGcoZeRpB5t/sjC7eSdzOGXkaIeCnM7qG0g9wRhqb8MGdkaV6o5ZwejBv5H56o1PeyOmg+HfwRhqngZ/e6BpKP8AbaZxSDu6NpCJb9fUW3ugzUDvpjfwOiruD7GZSZTB39Bmpvcsd6XN6e3fka0Rqw59Zmqo/GvDM8qinKw7+O4bcfxJZXs36o6YdODvtYArf+BSD62JcMJRFOBHS8rbm47S+GRb+u7sWFkYk34Z3P57DB0z5IMJptSTu5mV3NrxzVW2YdPeCGhzlekEz9xm0XLuTsqumcB/HlhcdUe/PrgwrAJrnV60u4V0nVOC0M3HVW6TWpJl27+voUu0JOEeX2HSdiWPZ7UO+2pnw1/2F5Kr4/ta3/eV/</diagram></mxfile>
1+
<mxfile userAgent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36" version="6.9.5" editor="www.draw.io" type="device"><diagram name="Page-1" id="9c1861c3-233a-52e1-f412-60270fc0d40d">7VvLcuI4FP0alrhs+ckyIZ2excxUqrLonpVL2AJUsS23LALk60fCFiA/QASbpLoDC+xrWZbOufdYVxIje5puvlOYL/8hMUpGwIw3I/thBAAw/YD/CMu2tFiWY5aWBcVxZTsYnvEbqoyy2ArHqFAKMkIShnPVGJEsQxFTbJBSslaLzUmiPjWHC9QwPEcwaVp/4JgtS2vgmgf7XwgvlvLJllldmcHoZUHJKqueNwL2fPcpL6dQ1lWVL5YwJusjk/1tZE8pIaw8SjdTlAhwJWzlfY8dV/ftpihjOjeA8oZXmKyQbPGuXWwrsYhhsUSiuDmy75csTfihxQ9503NRJN0shBcYcF0Ao2CEcnBCmMUhJ4fxZoTcO/ArolujsMPZKnpBTNzNKHlBU5IQyuvISMarum+2vuoQv52hzZGp6s13RFLE6JYXkVcDv7ylcj1bcrY+8Og6pWl5RKHrVd5Tec5iX/MBPX5QAdgOpt0zmBFJ8xUrwcwQWxP6grOFkcB0FsNBMQwUDIHZguGkiaEPrsfQ+U0wdMyJgqEF3CaGfhNDz7keQ69nDGGehwWirzhChVH8KsJfKyRqHg48Cyjg+S3+14Kd3UMM+z1jh7kGUu54IZmHbMldrzCWjOVhTgkjEUkGRdFVpdDzmzCCFhgn18MYnIcRZfGdeFXzsyiBRYEjFc5OBFCsvLyb/T/un9ki9ZWNogQy/mZSRw8tna6e8EQ4mQd4QS3CbTnGkVUUZEUjVN11/A4+UxEIahUxSBeINSracbDvthYtk4+kRXbX6uiuJm0XMGT1xVBXk/tnSI6AP5YicDOKQF8UdTV5AIo0Bs3nKOLM0O3P6h2yO/lPnBiuPH1CFPM2ISrLbDD7KW/mx0fF+Vm9dKcHlHAqw9USGGX09UkU1pqYCqdOnVNt53BqFdnDOYdGEnAj5wBWz97hXaginoq6DQZTEVd9Eqjzq+0oXU0ewFE0Mp0bOYo/6dlR/Asdxa+hbg/mKPUnTd7rKF1N7t9RXA1HEUkBjmDyN5yh5IkUmGGS8UszwhhJVaeRZe8SvBBlGMkbSciIJwu7D78yx0lyZL/ffdvSnbcVRUYR8dxolQgX6mUqwq2l0b7VcA2Z9h27hgO6XUM7FdQYiom5vly/n/sZSziTNZin+y87J5XNCxr991pCI7Cv77/3/v7vobsagLHX5LtNCnror6uRtLbEUWfMJeLC/X5C+CiGqinh07EoJ57BJdE5d8W3jM5yYlnouDJjUY9bTArfwJyWwohJtEp3JLbErtfFaSd3zRkv6QTKdMMJFdemTiOxlf3G6W7y/7wk9kHw7mF3RV4uUgj4oTyZ443g5L5qz4OYFuIQ3IlOg8cozqwdK3OcxYiKeUxujSGD/GfHFv9dwnV1ZAusfNM0fTAWV8d2MOaugcbbNBlbIDDybHEVpfuRlVY49sGpZ53ndHD5cdS3T3NcMpQYeRpJ5p8sRhePJG4nRp5GCvglRtdQ+gFipDEIH1yMLFWNWtLpwdTI/VKjU2rkdFD8GdRIYzXwS42uofQD1EhjlXJwNZId2aqnt1Cjr0TtpBr5HRR3J9nNSZXB5OgrU3uXHOlzens58jUyteHXLE1VjwZcs6w96YqF/44m9z+JLLdm/VZhB86GHUzhGw8xuC7GBUNZhBNh3e3WfJxWO8PCf/fbwsKI5Nvw7sdz+IApb0Q4LV+J+7jsng3v9F99FbYntuIOVjBpeLA0KXO9PezX8ftYtrxoifqwdmVYAdBcv+pE+dyCFLhQTFx1F6k1aU6797V0qT4JOLVNbLpi4lh2e5MHEJN+N2naBmRLlMFmdD26gWs73M7Lxphz3rpJrmUo2/CQE0FX2zfSssDke4bb5PpgvSrwNDLrT7GZpBY674pC53TQBR3v1eGDTg6vLg46oO6UdvyaFr876Pjp4V8AZfHDfy3sb/8D</diagram></mxfile>

lambda_cron/aws/lib/task_runner.py

+13
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,9 @@ def get_http_task_runner(self, task):
4545
def get_batch_task_runner(self, task):
4646
return BatchJobTask(task)
4747

48+
def get_athena_task_runner(self, task):
49+
return AthenaQueryTask(task)
50+
4851

4952
class Task:
5053

@@ -97,3 +100,13 @@ def get_batch_client(self):
97100
def run(self):
98101
self.task.pop('type')
99102
self.get_batch_client().submit_job(**self.task)
103+
104+
105+
class AthenaQueryTask(Task):
106+
107+
def get_athena_client(self):
108+
return boto3.client('athena')
109+
110+
def run(self):
111+
self.task.pop('type')
112+
self.get_athena_client().start_query_execution(**self.task)

lambda_cron/schema.json

+17
Original file line numberDiff line numberDiff line change
@@ -40,6 +40,23 @@
4040
},
4141
"required": ["type", "jobName", "jobQueue", "jobDefinition"]
4242
},
43+
{
44+
"properties": {
45+
"type": { "type": "string", "enum": ["athena"] },
46+
"QueryString": { "type": "string" },
47+
"ClientRequestToken": { "type": "string" },
48+
"QueryExecutionContext": { "type": "object" },
49+
"ResultConfiguration": {
50+
"type": "object",
51+
"properties": {
52+
"OutputLocation": { "type": "string" },
53+
"EncryptionConfiguration": { "type": "object" }
54+
},
55+
"required": ["OutputLocation"]
56+
}
57+
},
58+
"required": ["type", "QueryString", "ResultConfiguration"]
59+
},
4360
{
4461
"properties": {
4562
"type": { "type": "string", "enum": ["http"] },

lambda_cron/template.cfn.yml

+8-6
Original file line numberDiff line numberDiff line change
@@ -72,12 +72,10 @@ Resources:
7272
Resource: arn:aws:logs:*:*:*
7373
- Effect: Allow
7474
Action:
75-
- s3:ListBucket
76-
Resource: !Sub arn:aws:s3:::${Bucket}
77-
- Effect: Allow
78-
Action:
79-
- s3:GetObject
80-
Resource: !Sub arn:aws:s3:::${Bucket}/tasks/*
75+
- s3:Get*
76+
- s3:List*
77+
- s3:PutObject
78+
Resource: ['*']
8179
- Effect: Allow
8280
Action:
8381
- sqs:SendMessage
@@ -91,6 +89,10 @@ Resources:
9189
Action:
9290
- batch:SubmitJob
9391
Resource: ['*']
92+
- Effect: Allow
93+
Action:
94+
- athena:StartQueryExecution
95+
Resource: ['*']
9496

9597
LambdaCronHourlyEvent:
9698
Type: AWS::Events::Rule

tests/aws/test_task_runner.py

+53-1
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,8 @@
1515
import pytest
1616
import json
1717
from mock import patch
18-
from lambda_cron.aws.lib.task_runner import TaskRunner, QueueTask, InvokeLambdaTask, HttpTask, BatchJobTask
18+
from lambda_cron.aws.lib.task_runner import TaskRunner, QueueTask, InvokeLambdaTask, HttpTask, BatchJobTask, \
19+
AthenaQueryTask
1920
from lambda_cron.aws.lib.cron_checker import CronChecker
2021

2122

@@ -377,3 +378,54 @@ def test_batch_should_run_basic(get_batch_client_mock, batch_client_spy, cron_ch
377378
assert batch_client_spy.parameters['jobQueue'] == 'testing-batch-job-queue'
378379
assert 'parameters' in batch_client_spy.parameters
379380
assert batch_client_spy.parameters['parameters'] == BATCH_TASK_BODY['parameters']
381+
382+
383+
class AthenaClientSpy:
384+
def __init__(self):
385+
self.parameters = None
386+
self.calls = 0
387+
388+
def start_query_execution(self, **kwargs):
389+
self.parameters = kwargs
390+
self.calls += 1
391+
392+
393+
@pytest.fixture(scope="function")
394+
def athena_client_spy():
395+
return AthenaClientSpy()
396+
397+
398+
ATHENA_TASK_BODY =\
399+
{
400+
'type': 'athena',
401+
'QueryString': 'SELECT * FROM testing',
402+
'ResultConfiguration':
403+
{
404+
'OutputLocation': 'bucketname.s3.aws.com/foo/bar/',
405+
}
406+
}
407+
408+
409+
@pytest.fixture(scope="function")
410+
def athena_task_definition():
411+
return {
412+
'name': 'Test task',
413+
'expression': '0 11 * * *',
414+
'task': dict(ATHENA_TASK_BODY)
415+
}
416+
417+
418+
@patch.object(AthenaQueryTask, 'get_athena_client')
419+
def test_athena_should_run_basic(get_athena_client_mock, athena_client_spy, cron_checker, athena_task_definition):
420+
get_athena_client_mock.return_value = athena_client_spy
421+
422+
task_runner = TaskRunner(cron_checker)
423+
task_runner.run(athena_task_definition)
424+
425+
assert athena_client_spy.calls == 1
426+
assert 'QueryString' in athena_client_spy.parameters
427+
assert athena_client_spy.parameters['QueryString'] == 'SELECT * FROM testing'
428+
assert 'ResultConfiguration' in athena_client_spy.parameters
429+
assert type(athena_client_spy.parameters['ResultConfiguration']) == dict
430+
assert sorted(athena_client_spy.parameters['ResultConfiguration'].keys()) == ['OutputLocation']
431+
assert athena_client_spy.parameters['ResultConfiguration']['OutputLocation'] == 'bucketname.s3.aws.com/foo/bar/'

tests/cli/command/test_validate.py

+8
Original file line numberDiff line numberDiff line change
@@ -52,6 +52,14 @@ def test_validate_command_lambda_task():
5252
validate_command.run()
5353

5454

55+
def test_validate_command_athena_task():
56+
cli_arguments = Namespace()
57+
cli_arguments.task_file = get_test_task_path('valid/athena_task.yml')
58+
cli_arguments.task_directory = None
59+
validate_command = ValidateCommand(CliConfig('test'), cli_arguments)
60+
validate_command.run()
61+
62+
5563
def test_validate_command_with_error():
5664
cli_arguments = Namespace()
5765
cli_arguments.task_file = get_test_task_path('invalid/invalid_task.yml')
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
name: 'test athena task'
2+
expression: '0 11 * * *'
3+
task:
4+
type: 'athena'
5+
QueryString: 'SELECT * FROM TestTable'
6+
ResultConfiguration:
7+
OutputLocation: 'test-bucket'
8+

0 commit comments

Comments
 (0)