Commit fefd7b6

Add devstats-data as a submodule (#9)
* Add devstats data as a submodule.
* Update README with info about building/data collection.
* Update download script.
1 parent 598f5f1 commit fefd7b6

4 files changed

Lines changed: 51 additions & 2 deletions

.gitmodules

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+[submodule "devstats-data"]
+	path = devstats-data
+	url = https://github.com/scientific-python/devstats-data.git

README.rst

Lines changed: 40 additions & 0 deletions
@@ -8,3 +8,43 @@ Scientific Python Development and Community Statistics
 
 A static site for collating & publicizing development and community statistics
 for Scientific Python projects.
+
+Installation
+------------
+
+```python
+python -m pip install -r requirements.txt
+```
+
+Building the site
+-----------------
+
+To get the data used in the analysis:
+
+```bash
+git submodule update --init
+```
+
+Then the site can be built with:
+
+```bash
+cd site
+make html
+```
+
+Devstats data
+-------------
+
+The data used for the officially deployed site is available at
+<https://github.com/scientific-python/devstats-data>.
+The data repository is included as a submodule in this project, so updating the
+submodule is all that is required to ensure you are using the latest data for
+the published site.
+The ``query.py`` script can be used to collect data for other projects like
+so: ``python query.py <repo_owner> <repo_name>`` where ``repo_owner`` and
+``repo_name`` are the names of the **org** and **repo** on GitHub, respectively.
+For example, to download the latest data for ``pandas``:
+
+```bash
+python query.py pandas-dev pandas
+```
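The README text in this hunk describes running ``query.py`` once per project. As a minimal sketch, and not part of this commit, the same command line could be assembled for several projects from Python. The helper names `build_query_command` and `build_all` and the project list are hypothetical; only the `python query.py <repo_owner> <repo_name>` form and the `pandas-dev`/`pandas` pair come from the README.

```python
import sys

# Illustrative (owner, repo) pairs; only pandas-dev/pandas appears in the README.
PROJECTS = [("pandas-dev", "pandas"), ("numpy", "numpy")]


def build_query_command(repo_owner, repo_name, script="query.py"):
    """Hypothetical helper: return the argv list for one query.py run."""
    # sys.executable keeps the run inside the current Python environment.
    return [sys.executable, script, repo_owner, repo_name]


def build_all(projects=PROJECTS):
    """Return one command per project, in the order given."""
    return [build_query_command(owner, name) for owner, name in projects]
```

Each list returned by `build_all()` could then be passed to `subprocess.run(cmd, check=True)` from the repository root, so that a failed download stops the loop instead of passing silently.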

devstats-data

Submodule devstats-data added at 026908a

query.py

Lines changed: 7 additions & 2 deletions
@@ -186,22 +186,27 @@ def dump(self, outfile):
 @click.argument('repo_name')
 def main(repo_owner, repo_name):
     """Download and save issue and pr data for `repo_owner`/`repo_name`."""
+    # Make sure this works even if the submodule has not been initialized
+    import os
+    os.makedirs("devstats-data", exist_ok=True)
+    # Download issue data
     issues = GithubGrabber(
         'query_examples/issue_activity_since_date.gql',
         'issues',
         repo_owner=repo_owner,
         repo_name=repo_name,
     )
     issues.get()
-    issues.dump(f"_data/{repo_name}_issues.json")
+    issues.dump(f"devstats-data/{repo_name}_issues.json")
+    # Download PR data
     prs = GithubGrabber(
         'query_examples/pr_data_query.gql',
         'pullRequests',
         repo_owner=repo_owner,
         repo_name=repo_name,
     )
     prs.get()
-    prs.dump(f"_data/{repo_name}_prs.json")
+    prs.dump(f"devstats-data/{repo_name}_prs.json")
 
 
 
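The hunk above creates the ``devstats-data`` directory before writing, so the script does not fail when the submodule has not been initialized. A minimal standalone sketch of that pattern follows. The helpers `output_path` and `dump_json` are hypothetical stand-ins, not the `GithubGrabber.dump` method from ``query.py``, whose body is not shown in this diff; only the directory name and the `{repo_name}_issues.json` / `{repo_name}_prs.json` naming are taken from the commit.

```python
import json
from pathlib import Path

DATA_DIR = "devstats-data"  # directory name used in the diff


def output_path(repo_name, kind, data_dir=DATA_DIR):
    """Hypothetical helper: mirror the f-string paths used in main()."""
    return Path(data_dir) / f"{repo_name}_{kind}.json"


def dump_json(data, outfile):
    """Hypothetical stand-in for a dump step: write data as JSON."""
    outfile = Path(outfile)
    # exist_ok=True makes this a no-op when the directory already exists,
    # for example when the submodule has been checked out.
    outfile.parent.mkdir(parents=True, exist_ok=True)
    with open(outfile, "w") as fh:
        json.dump(data, fh)
    return outfile
```

Creating the directory next to the write, rather than once at the top of `main`, is a design choice of this sketch; the commit itself calls `os.makedirs("devstats-data", exist_ok=True)` once before both downloads.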
