This repository was archived by the owner on Nov 5, 2019. It is now read-only.

Commit da08d5f

Improved README.
* Made project description less technical.
* Linked files in repo.
* Cleared up terminology around data repositories.
1 parent 1cf11a5 commit da08d5f

1 file changed

+25
-9
lines changed

README.md

Lines changed: 25 additions & 9 deletions
@@ -5,19 +5,32 @@
 [![License](https://img.shields.io/github/license/datatogether/coverage.svg)](./LICENSE)
 [![Codecov](https://img.shields.io/codecov/c/github/datatogether/coverage.svg?style=flat-square)](https://codecov.io/gh/datatogether/coverage)

-Visualization to display "archival coverage," starting with epa.gov. This takes a list of urls and associated archiving information, and turns that into a tree of url paths with associated coverage information.
+**Coverage** is a project for visualizing the status of digital data archiving efforts across various data repositories run by different initiatives. Its current scope covers data within the epa.gov domain.

-The output is cached in `cache.json`, because this is a large file, we provide incremental pieces of the cached tree as a web server. To dynamically calculate coverage completion, you can work with the `cache.json` file.
+This code repo provides the JSON back-end: [`https://api.archivers.co/coverage`](https://api.archivers.co/coverage)

-## Current Coverage Sources
+The [`datatogether/webapp` repo](https://github.com/datatogether/webapp) provides the visual front-end: [`https://archivers.co/coverage`](https://archivers.co/coverage)

-Actual source datasets can be found in the `/repositories` directory. It currently includes the following:

-* Archivers 2
-* archivers.space
-* EDGI Nomination Tool Uncrawlables
-* The Internet Archive
-* Project Svalbard json-ld crawl
+## Current Data Repositories
+
+Actual source datasets can be found in each [`/repositories/*` directory](/repositories). It currently includes the following:
+
+* [Archivers 2](https://alpha.archivers.space/)
+* [archivers.space](https://archivers.space/)
+* [EDGI Nomination Tool](https://chrome.google.com/webstore/detail/nominationtool/abjpihafglmijnkkoppbookfkkanklok?hl=en) Uncrawlables
+* [The Internet Archive](https://archive.org/)
+* [Project Svalbard](https://github.com/datproject/svalbard) JSON-LD crawl
+
+Requests for new data repositories are tracked under the [`data-repository`](https://github.com/datatogether/coverage/labels/data-repository) issue label.
+
+
+## How It Works
+
+It takes a list of urls and associated archiving information, and turns that into a tree of url paths with associated coverage information.
+
+The output is cached in [`cache.json`](cache.json). Because this is a large file, we provide incremental pieces of the cached tree as a web server. To dynamically calculate coverage completion, you can work with the `cache.json` file.
+

 ## License & Copyright

@@ -32,18 +45,21 @@ PARTICULAR PURPOSE.

 See the [`LICENSE`](./LICENSE) file for details.

+
 ## Getting Involved

 We would love involvement from more people! If you notice any errors or would like to submit changes, please see our [Contributing Guidelines](./github/CONTRIBUTING.md).

 We use GitHub issues for [tracking bugs and feature requests](./issues) and Pull Requests (PRs) for [submitting changes](./pulls).

+
 ## Installation

 The easiest way to get going is to use [docker-compose](https://docs.docker.com/compose/install/). Once you have that:

 TODO - finish installation instructions

+
 ## Development

 Coming soon.
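
The "How It Works" paragraph in this README describes turning a list of URLs and archiving information into a tree of URL paths with coverage information. As a rough illustration only, that idea could be sketched in Python as below. This is not the repository's actual implementation: the `Node` class, the `build_tree` and `coverage` functions, and the `(url, archived)` record shape are all hypothetical names assumed for the example.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Node:
    """One segment of a URL path, e.g. 'www.epa.gov' or 'air' (hypothetical type)."""
    name: str
    is_url: bool = False      # True if a listed URL ends exactly at this node
    archived: bool = False    # True if that URL is recorded as archived
    children: dict = field(default_factory=dict)


def build_tree(records):
    """Build a path tree from an iterable of (url, archived) pairs."""
    root = Node(name="")
    for url, archived in records:
        parts = urlsplit(url)
        # Host first, then each non-empty path segment.
        segments = [parts.netloc] + [s for s in parts.path.split("/") if s]
        node = root
        for segment in segments:
            node = node.children.setdefault(segment, Node(name=segment))
        node.is_url = True
        node.archived = node.archived or archived
    return root


def coverage(node):
    """Return (archived_count, total_count) for listed URLs at or below node."""
    total = int(node.is_url)
    archived = int(node.is_url and node.archived)
    for child in node.children.values():
        child_archived, child_total = coverage(child)
        archived += child_archived
        total += child_total
    return archived, total
```

With a few made-up records, `coverage(build_tree(records))` gives the archived and total URL counts for the whole tree, and calling `coverage` on any child node gives the same pair for that subtree, which is the kind of incremental piece a server could return instead of the full cached tree.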
