This repository was archived by the owner on Nov 5, 2019. It is now read-only.

Commit 1cf11a5

Merge pull request #17 from datatogether/sprint_prep
Sprint prep
2 parents f143135 + a8d3d2f commit 1cf11a5

301 files changed (+30622, -7564 lines)


.circleci/config.yml

Lines changed: 4 additions & 1 deletion
@@ -35,12 +35,15 @@ jobs:
           command: go-wrapper download && go-wrapper install && go get -v github.com/jstemmer/go-junit-report
       - run:
           name: Run tests
-          command: go test -v -race ./... | tee /tmp/test-reports/datatogether/original.txt ; test ${PIPESTATUS[0]} -eq 0
+          command: go test -v -race --coverprofile=coverage.txt -covermode=atomic | tee /tmp/test-reports/datatogether/original.txt ; test ${PIPESTATUS[0]} -eq 0
       - run:
           name: Convert test output to junit-style xml
           command: cat /tmp/test-reports/datatogether/original.txt | go-junit-report > /tmp/test-reports/datatogether/junit.xml
       - store_test_results:
           path: /tmp/test-reports/datatogether/junit.xml
+      - run:
+          name: Publish coverage info to codecov.io
+          command: bash <(curl -s https://codecov.io/bash)
       - setup_remote_docker
       - run:
           name: Install Docker client

.github/CONTRIBUTING.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+# Contributing Guidelines
+
+We love improvements to our tools! Take a moment to check out our organization-wide [Contributing Guidelines](https://github.com/datatogether/datatogether/blob/master/CONTRIBUTING.md) and [Code of Conduct](https://github.com/datatogether/datatogether/blob/master/CONDUCT.md).

.github/ISSUE_TEMPLATE.md

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+Hey there, thank you for submitting an issue!
+
+We are trying to keep issues for feature requests and bug reports. Please
+complete the following checklist before creating a new one:
+
+- [ ] Is this a **bug report** (if so, is it something you can **debug and fix**?
+Send a pull request!)
+- [ ] feature request
+- [ ] support request => Please do not submit support requests here, ask your question
+on [Slack](https://archivers-slack.herokuapp.com/).
+
+---

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 .DS_Store
 gin-bin
 coverage
-coverage.out
+coverage.txt
 config.*.json
 *.env

Godeps/Godeps.json

Lines changed: 47 additions & 30 deletions
Some generated files are not rendered by default.

README.md

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+# Coverage
+
+[![GitHub](https://img.shields.io/badge/project-Data_Together-487b57.svg?style=flat-square)](http://github.com/datatogether)
+[![Slack](https://img.shields.io/badge/slack-Archivers-b44e88.svg?style=flat-square)](https://archivers-slack.herokuapp.com/)
+[![License](https://img.shields.io/github/license/datatogether/coverage.svg)](./LICENSE)
+[![Codecov](https://img.shields.io/codecov/c/github/datatogether/coverage.svg?style=flat-square)](https://codecov.io/gh/datatogether/coverage)
+
+Visualization to display "archival coverage," starting with epa.gov. This takes a list of urls and associated archiving information, and turns that into a tree of url paths with associated coverage information.
+
+The output is cached in `cache.json`, because this is a large file, we provide incremental pieces of the cached tree as a web server. To dynamically calculate coverage completion to can work with the `cache.json` file.
+
+## Current Coverage Sources
+
+Actual source datasets can be found in the `/repositories` directory. It currently includes the following:
+
+* Archivers 2
+* archivers.space
+* EDGI Nomination Tool Uncrawlables
+* The Internet Archive
+* Project Svalbard json-ld crawl
+
+## License & Copyright
+
+Copyright (C) 2017 Data Together
+This program is free software: you can redistribute it and/or modify it under
+the terms of the GNU Affero General Public License as published by the Free Software
+Foundation, version 3.0.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
+PARTICULAR PURPOSE.
+
+See the [`LICENSE`](./LICENSE) file for details.
+
+## Getting Involved
+
+We would love involvement from more people! If you notice any errors or would like to submit changes, please see our [Contributing Guidelines](./github/CONTRIBUTING.md).
+
+We use GitHub issues for [tracking bugs and feature requests](./issues) and Pull Requests (PRs) for [submitting changes](./pulls)
+
+## Installation
+
+The easiest way to get going is to use [docker-compose](https://docs.docker.com/compose/install/). Once you have that:
+
+TODO - finish installation instructions
+
+## Development
+
+Coming soon.
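The README added above describes turning a list of URLs with archiving information into a tree of URL paths that carries coverage counts. The following is a minimal Go sketch of that idea, not the repository's implementation: `Node`, `ArchivedURL`, and `BuildTree` are hypothetical names introduced for illustration, and the real `tree.Node` type is not shown in this diff and may be shaped differently.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Node is a hypothetical tree node (assumption: the real tree.Node may differ).
type Node struct {
	Name     string
	Total    int // URLs at or below this node
	Archived int // archived URLs at or below this node
	Children map[string]*Node
}

// ArchivedURL is a hypothetical input record: a URL plus whether it is archived.
type ArchivedURL struct {
	URL      string
	Archived bool
}

func newNode(name string) *Node {
	return &Node{Name: name, Children: map[string]*Node{}}
}

// add counts one URL against this node.
func (n *Node) add(archived bool) {
	n.Total++
	if archived {
		n.Archived++
	}
}

// BuildTree folds a flat URL list into a tree keyed by host, then by path
// segment, incrementing the counters of every node along each URL's path.
func BuildTree(urls []ArchivedURL) (*Node, error) {
	root := newNode("")
	for _, u := range urls {
		parsed, err := url.Parse(u.URL)
		if err != nil {
			return nil, err
		}
		segments := []string{parsed.Host}
		for _, s := range strings.Split(parsed.Path, "/") {
			if s != "" {
				segments = append(segments, s)
			}
		}
		n := root
		n.add(u.Archived)
		for _, s := range segments {
			child, ok := n.Children[s]
			if !ok {
				child = newNode(s)
				n.Children[s] = child
			}
			child.add(u.Archived)
			n = child
		}
	}
	return root, nil
}

func main() {
	root, err := BuildTree([]ArchivedURL{
		{URL: "https://www.epa.gov/air/data", Archived: true},
		{URL: "https://www.epa.gov/air/reports", Archived: false},
		{URL: "https://www.epa.gov/water", Archived: true},
	})
	if err != nil {
		panic(err)
	}
	epa := root.Children["www.epa.gov"]
	fmt.Printf("%s: %d/%d archived\n", epa.Name, epa.Archived, epa.Total)
}
```

Keeping running totals on every ancestor means the coverage ratio for any subtree can be read directly from that node, which fits the README's note about serving pieces of the tree incrementally.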

coverage/coverage.go

Lines changed: 5 additions & 5 deletions
@@ -1,7 +1,7 @@
 package coverage
 
 import (
-	"github.com/datatogether/archive"
+	"github.com/datatogether/core"
 	"github.com/datatogether/coverage/repositories"
 	"github.com/datatogether/coverage/tree"
 
@@ -86,18 +86,18 @@ func WriteTreeCache(filename string, n *tree.Node) error {
 type CoverageGenerator struct {
 	// Root url.Url
 	// Depth int
-	Sources []*archive.Source
+	Sources []*core.Source
 	Repos   []repositories.CoverageRepository
 }
 
 // NewCoverageGenerator creates a CoverageGenerator with the default
 // properties
 func NewCoverageGenerator(repoIds []string, patterns []string) *CoverageGenerator {
-	var sources []*archive.Source
+	var sources []*core.Source
 	if patterns != nil {
-		sources := make([]*archive.Source, len(patterns))
+		sources := make([]*core.Source, len(patterns))
 		for i, pattern := range patterns {
-			sources[i] = &archive.Source{
+			sources[i] = &core.Source{
 				Url: pattern,
 			}
 		}
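In the `NewCoverageGenerator` hunk, `sources` is declared with `var` and then declared again with `:=` inside the `if patterns != nil` block. In Go, the inner `:=` creates a new variable scoped to that block, so the outer slice is not populated by the loop. The rest of the function is not visible in this diff, so whether this affects behaviour there is an assumption. The sketch below only demonstrates the language rule, using a stand-in `Source` type and two hypothetical functions, `shadowed` and `assigned`.

```go
package main

import "fmt"

// Source is a stand-in for core.Source, reduced to the one field the diff shows.
type Source struct {
	Url string
}

// shadowed mirrors the shape of the hunk: ":=" inside the if block declares
// a second variable named sources that goes out of scope at the closing brace.
func shadowed(patterns []string) []*Source {
	var sources []*Source
	if patterns != nil {
		sources := make([]*Source, len(patterns)) // new, block-scoped variable
		for i, pattern := range patterns {
			sources[i] = &Source{Url: pattern}
		}
	}
	return sources // the outer slice, still nil
}

// assigned uses "=" so the loop fills the outer slice.
func assigned(patterns []string) []*Source {
	var sources []*Source
	if patterns != nil {
		sources = make([]*Source, len(patterns))
		for i, pattern := range patterns {
			sources[i] = &Source{Url: pattern}
		}
	}
	return sources
}

func main() {
	patterns := []string{"https://www.epa.gov"}
	fmt.Println(len(shadowed(patterns)), len(assigned(patterns)))
}
```

The `shadow` analyzer from `golang.org/x/tools` can flag this pattern; it is not part of the default `go vet` checks.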

coverage/coverage_requests.go

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@ package coverage
 
 import (
 	"fmt"
-	"github.com/datatogether/archive"
+	"github.com/datatogether/core"
 	"github.com/datatogether/coverage/tree"
 	"net/url"
 	"strings"
@@ -72,9 +72,9 @@ func (p *CoverageSummaryParams) Validate() error {
 }
 
 func (CoverageRequests) Summary(p *CoverageSummaryParams, res *Summary) error {
-	sources := make([]*archive.Source, len(p.Patterns))
+	sources := make([]*core.Source, len(p.Patterns))
 	for i, p := range p.Patterns {
-		sources[i] = &archive.Source{Url: p}
+		sources[i] = &core.Source{Url: p}
 	}
 	summary, err := NewCoverageGenerator(p.RepoIds, p.Patterns).Summary()
 	if err != nil {

cron.go

Lines changed: 8 additions & 8 deletions
@@ -2,7 +2,7 @@ package main
 
 import (
 	"database/sql"
-	"github.com/datatogether/archive"
+	"github.com/datatogether/core"
 	"github.com/datatogether/coverage/coverage"
 	"time"
 )
@@ -49,14 +49,14 @@ func calcSourceCoverage(db *sql.DB) error {
 	cvg := coverage.NewCoverageGenerator(nil, nil)
 	pageSize := 100
 
-	count, err := archive.CountSources(appDB)
+	count, err := core.CountSources(appDB)
 	if err != nil {
 		return err
 	}
 
 	numPages := count / pageSize
 	for page := 0; page <= numPages; page++ {
-		sources, err := archive.ListSources(store, pageSize, pageSize*page)
+		sources, err := core.ListSources(store, pageSize, pageSize*page)
 		if err != nil {
 			return err
 		}
@@ -68,7 +68,7 @@ func calcSourceCoverage(db *sql.DB) error {
 			}
 
 			if s.Stats == nil {
-				s.Stats = &archive.SourceStats{}
+				s.Stats = &core.SourceStats{}
 			}
 
 			if s.Stats.ArchivedUrlCount != summary.Archived {
@@ -88,14 +88,14 @@ func calcSourceCoverage(db *sql.DB) error {
 func calcPrimerSourceCoverage(db *sql.DB) error {
 	pageSize := 100
 
-	count, err := archive.CountPrimers(appDB)
+	count, err := core.CountPrimers(appDB)
 	if err != nil {
 		return err
 	}
 
 	numPages := int(count) / pageSize
 	for page := 0; page <= numPages; page++ {
-		primers, err := archive.ListPrimers(store, pageSize, pageSize*page)
+		primers, err := core.ListPrimers(store, pageSize, pageSize*page)
 		if err != nil {
 			return err
 		}
@@ -116,7 +116,7 @@ func calcPrimerSourceCoverage(db *sql.DB) error {
 			}
 
 			if p.Stats == nil {
-				p.Stats = &archive.PrimerStats{}
+				p.Stats = &core.PrimerStats{}
 			}
 
 			if p.Stats.SourcesUrlCount != urlCount || p.Stats.SourcesArchivedUrlCount != archivedCount {
@@ -133,7 +133,7 @@ func calcPrimerSourceCoverage(db *sql.DB) error {
 }
 
 // TODO - finish
-func calcPrimerCoverage(db *sql.DB, primers []*archive.Primer) error {
+func calcPrimerCoverage(db *sql.DB, primers []*core.Primer) error {
 	for _, primer := range primers {
 		if err := primer.ReadSubPrimers(db); err != nil {
 			return err
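Both `calcSourceCoverage` and `calcPrimerSourceCoverage` walk their records in pages of 100, computing `numPages := count / pageSize` and passing `pageSize` and `pageSize*page` to the list call as what appear to be limit and offset. The sketch below illustrates that loop shape against an in-memory slice. `listPage` and `forEachPage` are hypothetical helpers, and the limit/offset reading of `core.ListSources` and `core.ListPrimers` is an assumption taken from the argument order in the diff.

```go
package main

import "fmt"

// listPage is a hypothetical stand-in for a limit/offset query; it returns
// at most limit items starting at offset, or nil when offset is past the end.
func listPage(items []string, limit, offset int) []string {
	if offset >= len(items) {
		return nil
	}
	end := offset + limit
	if end > len(items) {
		end = len(items)
	}
	return items[offset:end]
}

// forEachPage visits items in batches of pageSize (which must be positive),
// using the same "page <= numPages" bound as the loops in the diff.
func forEachPage(items []string, pageSize int, visit func(page int, batch []string)) {
	count := len(items)
	numPages := count / pageSize
	for page := 0; page <= numPages; page++ {
		batch := listPage(items, pageSize, pageSize*page)
		if len(batch) == 0 {
			// When count is an exact multiple of pageSize, the inclusive
			// bound issues one extra query that returns nothing.
			continue
		}
		visit(page, batch)
	}
}

func main() {
	items := []string{"a", "b", "c", "d", "e"}
	forEachPage(items, 2, func(page int, batch []string) {
		fmt.Println(page, batch)
	})
}
```

The inclusive bound is what picks up the final partial page; the cost is one empty query whenever the count divides evenly by the page size.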
