Add deliverables for sprint 6
Signed-off-by: Minh Khue Tran <[email protected]>
Minh Khue Tran committed Nov 26, 2024
1 parent 673afa8 commit 2c97eff
Showing 6 changed files with 151 additions and 0 deletions.
Binary file added Deliverables/sprint-06/feature-board.PNG
45 changes: 45 additions & 0 deletions Deliverables/sprint-06/feature-board.tsv
@@ -0,0 +1,45 @@
Title URL Assignees Status Estimated size Real size Labels Sprint
Prepare RTDIP demo https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/43 Timm638 In Progress 8
Finish integrating ARIMA functionality of statsmodels into RTDIP https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/40 Timm638 In Progress 5 component Sprint 5
Data Binning https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/46 FelipeTrost In Progress 5 component Sprint 5
Homework - user/design/build documentation https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/56 FelipeTrost In Progress 5 documentation Sprint 6
Value range validation: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/67 mollle In Progress refactoring
Store monitoring outputs in a standardized format https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/26 dh1542 Feature Archive 13 enhancement
Reduce number of parameters needed to use ArimaPrediction effectively https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/41 chris-1187 Awaiting Review 8 component Sprint 5
Advanced Duplicate Detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/30 mollle Awaiting Review 2 component
One-Hot Encoding https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/45 kristen149 Awaiting Review 3 component Sprint 5
Interval Filtering not working for EventTime column of type 'datetime' https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/53 dh1542 Awaiting Review 2 bug Sprint 6
Time Series prediction using ARIMA https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/29 Timm638 Feature Archive 13 8 component Sprint 4
Flatline detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/44 mollle Feature Archive 2 2 component Sprint 5
Validation of value ranges https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/31 mollle Feature Archive 3 3 component Sprint 5
Missing value imputation https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/36 chris-1187 Feature Archive 13 13 component Sprint 5
Time series prediction with linear regression https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/28 FelipeTrost, kristen149 Feature Archive 8 8 component Sprint 5
Normalization of Data https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/18 kristen149, Timm638 Feature Archive 8 8 component
Clean data based on Interval/Pattern https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/22 dh1542 Feature Archive 8 8 component
Create a test pipeline to run during release https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/24 FelipeTrost Feature Archive 5 1
[Component] Identify missing data https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/2 mollle Feature Archive 8 8 enhancement
Explore the test data and brainstorm RTDIP component ideas https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/11 chris-1187 Feature Archive 5 5
[Component] Anomaly detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/6 FelipeTrost Feature Archive 3 8 enhancement
[sprint-02] Create software architecture diagram https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/10 dh1542, Timm638 Feature Archive 3 5
[sprint-02] Create software bill of materials https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/9 kristen149 Feature Archive 1 1
Fix broken virtual environment https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/8 dh1542, Timm638 Feature Archive 3 3 bug
[Component] Duplicate detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/4 chris-1187, dh1542 Feature Archive 8 8 enhancement
[Component] Outlier detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/3 FelipeTrost Feature Archive duplicate
Set up a development environment https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/1 Feature Archive good first issue
Dimensionality Reduction https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/17 Product Backlog component
Unified input data validation https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/60 Product Backlog refactoring
Duplicate detection: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/61 Product Backlog refactoring
Interval filtering: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/62 Product Backlog refactoring
Anomaly detection: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/63 Product Backlog refactoring
De/normalization: refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/64 Product Backlog refactoring
ARIMA: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/65 Product Backlog refactoring
Linear regression: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/66 Product Backlog refactoring
Flatline detection: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/68 Product Backlog refactoring
Missing data detection: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/69 Product Backlog refactoring
Demo pipeline of multiple components Product Backlog
Alternative Preprocessing Methods https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/19 Feature Archive component
Please adopt the Deliverables folder structure from https://github.com/amosproj/amos202Xss0Y-projname to your repo / branch https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/7 Feature Archive documentation
[Component] Trend Identification https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/20 Feature Archive
Define clear acceptance criteria for components https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/16 Feature Archive
Interval Screening and Missing Entry Insertion https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/47 Feature Archive
[Component] Data Format https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/21 Feature Archive
Binary file added Deliverables/sprint-06/imp-squared-backlog.PNG
17 changes: 17 additions & 0 deletions Deliverables/sprint-06/imp-squared-backlog.tsv
@@ -0,0 +1,17 @@
Title Assignees Status
Getting code reviews by shell In Progress
Make sure everyone can run the product (for example via readme doc) In Progress
Make sure the team meeting ends at 14:00 In Progress
Long term planning such as: (notes in description) In Progress
Assign non backlog homework tasks after team meeting In Progress
SDs: agree on datatype passing to ensure consistent functioning of components In Progress
Better communication regarding PR reviews - contact via Slack In Progress
Define unit test to a more detailed granularity In Progress
Slack workspace (Avi) Done
SD Meeting Done
Homework not assigned clearly - now assigned Done
Figure out pipeline bug - for everyone Done
Get to know expectations and requirements from Industry Partner Done
Discuss with industry partner an optimal time for the meeting to take place Done
No experience in ML Done
Coordinated PR Reviews Done
Binary file added Deliverables/sprint-06/planning-document.pdf
Binary file not shown.
89 changes: 89 additions & 0 deletions Deliverables/sprint-06/rtdip-sdk-sbom.md
@@ -0,0 +1,89 @@
# Software Bill of Materials (SBOM)

## Project Name: rtdip-sdk
## Version: [RTDIP_SDK_1]
## Date: [26.11.2024]
## License: Apache License, Version 2.0

### Overview
This SBOM lists all required and optional dependencies for the `rtdip-sdk` project, including their versions and licenses.

### Components


| No. | Name | Version Range | Supplier | License | Comment |
|-------|--------------------------------|----------------------|-----------------------------|--------------------|--------------------------|
| 1 | databricks-sql-connector | >=3.1.0,<4.0.0 | Databricks, Inc. | Apache 2.0 | SQL connector for Databricks |
| 2 | azure-identity | >=1.12.0,<2.0.0 | Microsoft | MIT | Identity management for Azure |
| 3 | pandas | >=1.5.2,<2.2.0 | The Pandas Development Team | BSD 3-Clause | Data manipulation library |
| 4 | jinja2 | >=3.1.4,<4.0.0 | Jinja2 Team | BSD 3-Clause | Template engine for Python |
| 5 | importlib_metadata | >=7.0.0 | PyPA | Apache 2.0 | Metadata for Python packages |
| 6 | semver | >=3.0.0,<4.0.0 | Mikhail Korobeynikov | BSD 3-Clause | Semantic versioning library |
| 7 | xlrd | >=2.0.1,<3.0.0 | Chris Withers | BSD | Library for reading Excel files |
| 8 | grpcio | >=1.48.1 | Google LLC | Apache 2.0 | gRPC library for Python |
| 9 | grpcio-status | >=1.48.1 | Google LLC | Apache 2.0 | gRPC status library |
| 10 | googleapis-common-protos | >=1.56.4 | Google LLC | Apache 2.0 | Common protobufs for Google APIs |
| 11 | langchain | >=0.2.0,<0.3.0 | Harrison Chase | MIT | Framework for LLMs |
| 12 | langchain-community | >=0.2.0,<0.3.0 | Harrison Chase | MIT | Community contributions to LangChain |
| 13 | openai | >=1.13.3,<2.0.0 | OpenAI | Apache 2.0 | OpenAI API client |
| 14 | pydantic | >=2.6.0,<3.0.0 | Samuel Colvin | MIT | Data validation library |
| 15 | pyspark | >=3.3.0,<3.6.0 | The Apache Software Foundation | Apache 2.0 | Spark library for Python |
| 16 | delta-spark | >=2.2.0,<3.3.0 | Databricks, Inc. | Apache 2.0 | Delta Lake integration with Spark |
| 17 | dependency-injector | >=4.41.0,<5.0.0 | Roman Mogylatov | BSD 3-Clause | Dependency injection framework |
| 18 | databricks-sdk | >=0.20.0,<1.0.0 | Databricks, Inc. | Apache 2.0 | SDK for Databricks services |
| 19 | azure-storage-file-datalake | >=12.12.0,<13.0.0 | Microsoft | MIT | Azure Data Lake Storage client |
| 20 | azure-mgmt-storage | >=21.0.0 | Microsoft | MIT | Azure Storage management client |
| 21 | azure-mgmt-eventgrid | >=10.2.0 | Microsoft | MIT | Azure Event Grid management client |
| 22 | boto3 | >=1.28.2,<2.0.0 | Amazon Web Services | Apache 2.0 | AWS SDK for Python |
| 23 | hvac | >=1.1.1 | HashiCorp | Apache 2.0 | HashiCorp Vault client |
| 24 | azure-keyvault-secrets | >=4.7.0,<5.0.0 | Microsoft | MIT | Azure Key Vault secrets management |
| 25 | web3 | >=6.18.0,<7.0.0 | N/A | MIT | Ethereum blockchain library |
| 26 | polars[deltalake] | >=0.18.8,<1.0.0 | N/A | MIT | DataFrame library with Delta Lake support |
| 27 | delta-sharing | >=1.0.0,<1.1.0 | N/A | Apache 2.0 | Delta Sharing library |
| 28 | xarray | >=2023.1.0,<2023.8.0 | N/A | Apache 2.0 | N-dimensional array library |
| 29 | ecmwf-api-client | >=1.6.3,<2.0.0 | N/A | Apache 2.0 | ECMWF API client |
| 30 | netCDF4 | >=1.6.4,<2.0.0 | N/A | MIT | NetCDF file reading/writing |
| 31 | joblib | >=1.3.2,<2.0.0 | N/A | BSD 3-Clause | Lightweight pipelining library |
| 32 | sqlparams | >=5.1.0,<6.0.0 | N/A | MIT | SQL query parameters library |
| 33 | entsoe-py | >=0.5.10,<1.0.0 | N/A | MIT | ENTSOE API client |
| 34 | pytest | ==7.4.0 | N/A | MIT | Testing framework |
| 35 | pytest-mock | ==3.11.1 | N/A | MIT | Mocking for pytest |
| 36 | pytest-cov | ==4.1.0 | N/A | MIT | Coverage reporting for pytest |
| 37 | pylint | ==2.17.4 | N/A | GPL 2.0 | Static code analysis for Python |
| 38 | pip | >=23.1.2 | N/A | MIT | Python package installer |
| 39 | turbodbc | ==4.11.0 | N/A | MIT | ODBC interface for Python |
| 40 | numpy | >=1.23.4,<2.0.0 | NumPy Developers | BSD 3-Clause | Numerical computing library |
| 41 | oauthlib | >=3.2.2,<4.0.0 | N/A | BSD 3-Clause | OAuth library |
| 42 | cryptography | >=38.0.3 | N/A | Apache 2.0 OR BSD 3-Clause | Cryptography library |
| 43 | fastapi | >=0.110.0,<1.0.0 | Sebastián Ramírez | MIT | Fast web framework |
| 44 | httpx | >=0.24.1,<1.0.0 | N/A | BSD 3-Clause | HTTP client for Python |
| 45 | openjdk | >=11.0.15,<12.0.0 | N/A | N/A | OpenJDK Java runtime |
| 46 | mkdocs-material | ==9.5.20 | N/A | MIT | Material theme for MkDocs |
| 47 | mkdocs-material-extensions | ==1.3.1 | N/A | MIT | Extensions for MkDocs |
| 48 | mkdocstrings | ==0.25.0 | N/A | ISC | Documentation generation |
| 49 | mkdocstrings-python | ==1.10.8 | N/A | MIT | Python support for mkdocstrings |
| 50 | mkdocs-macros-plugin | ==1.0.1 | N/A | MIT | Macros for MkDocs |
| 51 | mkdocs-autorefs | >=1.0.0,<1.1.0 | N/A | ISC | Automatic references for MkDocs |
| 52 | pygments | ==2.16.1 | N/A | BSD 2-Clause | Syntax highlighting library |
| 53 | pymdown-extensions | ==10.8.1 | N/A | MIT | Extensions for Markdown |
| 54 | pygithub | >=1.59.0 | N/A | LGPL 3.0 | GitHub API client |
| 55 | pyjwt | >=2.8.0,<3.0.0 | N/A | MIT | JSON Web Token (JWT) library |
| 56 | conda | >=24.9.2 | N/A | BSD 3-Clause | Package installer |
| 57 | python | >=3.9,<3.12 | Python Software Foundation | PSF | Python programming language |
| 58 | pyodbc | >=4.0.39,<5.0.0 | N/A | MIT | ODBC library for Python |
| 59 | twine | ==4.0.2 | PyPA | Apache 2.0 | Python package publishing tool |
| 60 | black | >=24.1.0 | Python Software Foundation | MIT | Code formatter for Python |
| 61 | great-expectations | >=0.18.8,<1.0.0 | N/A | Apache 2.0 | Data validation tool |
| 62 | azure-functions | >=1.15.0,<2.0.0 | Microsoft | MIT | Functions for Azure services |
| 63 | build | ==0.10.0 | PyPA | MIT | Python package build tool |
| 64 | deltalake | >=0.10.1,<1.0.0 | Delta, Inc. | Apache 2.0 | Delta Lake interaction for Python |
| 65 | trio | >=0.22.1 | Nathaniel J. Smith | MIT OR Apache 2.0 | Async library for concurrency |
| 66 | eth-typing | >=4.2.3,<5.0.0 | Ethereum Foundation | MIT | Ethereum types library |
| 67 | moto[s3] | >=5.0.16,<6.0.0 | Spulec | Apache 2.0 | Mock library for AWS S3 |
| 68 | pyarrow | >=14.0.1,<17.0.0 | Apache Arrow | Apache 2.0 | Columnar data storage and processing |
| 69 | statsmodels | >=0.14.1,<0.15.0 | Open source | BSD | Statistical models in Python |
| 70 | pmdarima | >=2.0.4 | Open source | MIT | Auto-ARIMA and time series forecasting |
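
The version ranges in the table above use pip-style specifiers such as `>=1.5.2,<2.2.0`. As a rough illustration of how such a range can be checked, the following sketch compares a version string against a range using only the standard library. The helper names `parse_version`, `satisfies`, and `check_installed` are hypothetical and not part of `rtdip-sdk`. The parsing is deliberately simplified: it handles plain numeric versions only, not pre-releases or other PEP 440 features.

```python
import operator
import re
from importlib import metadata

# Comparison operators supported by this simplified sketch.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '1.5.2' into (1, 5, 2). Only the leading numeric part is used."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"unsupported version: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def satisfies(version, version_range):
    """Return True if `version` meets every clause of `version_range`."""
    ver = parse_version(version)
    for clause in version_range.split(","):
        m = re.match(r"\s*(>=|<=|==|>|<)\s*(.+)", clause)
        if m is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        compare = OPS[m.group(1)]
        bound = parse_version(m.group(2))
        # Pad with zeros so that '3.9' and '3.9.0' compare as equal.
        width = max(len(ver), len(bound))
        left = ver + (0,) * (width - len(ver))
        right = bound + (0,) * (width - len(bound))
        if not compare(left, right):
            return False
    return True


def check_installed(name, version_range):
    """Check the installed distribution `name`; None if it is not installed."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return None
    return satisfies(installed, version_range)


if __name__ == "__main__":
    # Ranges copied from rows 3 and 57 of the table.
    print(satisfies("2.1.4", ">=1.5.2,<2.2.0"))
    print(satisfies("3.12.1", ">=3.9,<3.12"))
    print(check_installed("pandas", ">=1.5.2,<2.2.0"))
```

For real dependency checks, a full PEP 440 implementation is likely the safer choice, since this sketch ignores pre-release, post-release, and local version segments.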

### Summary
- **Total Components**: 70
- **Last Updated**: [26.11.2024]
