115 changes: 54 additions & 61 deletions README.md
@@ -1,61 +1,54 @@
# Augur NEW Release v0.42.0

[![standard-readme compliant](https://img.shields.io/badge/standard--readme-OK-green.svg?style=flat-square)](https://github.com/RichardLitt/standard-readme)

[![Build Docker images](https://github.com/chaoss/augur/actions/workflows/build_docker.yml/badge.svg)](https://github.com/chaoss/augur/actions/workflows/build_docker.yml)

[![Hits-of-Code](https://hitsofcode.com/github/chaoss/augur?branch=main)](https://hitsofcode.com/github/chaoss/augur/view?branch=main)


[![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/2788/badge)](https://bestpractices.coreinfrastructure.org/projects/2788)

## What is Augur?

Augur is a software suite for collecting and measuring structured data
about [free](https://www.fsf.org/about/) and [open-source](https://opensource.org/docs/osd) software (FOSS) communities.

We gather trace data for a group of repositories, normalize it into our data model, and provide a variety of metrics about said data. The structure of our data model enables us to synthesize data across various platforms to provide meaningful context for meaningful questions about the way these communities evolve.
Augur’s main focus is to measure the overall health and sustainability of open source projects, as these types of projects are system critical for nearly every software organization or company. We do this by gathering data about project repositories and normalizing that into our data model to provide useful metrics about your project’s health. For example, one of our metrics is Burstiness. Burstiness – how are short timeframes of intense activity, followed by a corresponding return to a typical pattern of activity, observed in a project?

This can paint a picture of a project’s focus and gain insight into the potential stability of a project and how its typical cycle of updates occurs.

We are a [CHAOSS](https://chaoss.community) project, and many of our
metrics are implementations of the metrics defined by our awesome community. You can find a full list of them [here](https://chaoss.community/metrics/).

For more information on how to get involved, see [the CHAOSS website](https://chaoss.community/participate/).

## Collecting Data

Augur supports Python3.6 through Python3.9 on all platforms. Python3.10 and above do not yet work because of machine learning worker dependencies. On OSX, you can create a Python 3.9 environment this way: `python3.9 -m venv path/to/venv`.

Augur's main focus is to measure the overall health and sustainability of open source projects.

Augur collects more data about open source software projects than any other available software. Augur's main focus is to measure the overall health and sustainability of open source projects.
One of Augur's core tenets is a desire to openly gather data that people can trust, and then provide useful and well-defined metrics that help give important context to the larger stories being told by that data. We do this in a variety of ways, one of which is doing all our own data collection in house. We currently collect data from a few main sources:

1. Raw Git commit logs (commits, contributors)
2. GitHub's API (issues, pull requests, contributors, releases, repository metadata)
3. The Linux Foundation's [Core Infrastructure Initiative](https://www.coreinfrastructure.org/) API (repository metadata)
4. [Succinct Code Counter](https://github.com/boyter/scc), a blazingly fast Sloc, Cloc, and Code tool that also performs COCOMO calculations

This data is collected by dedicated data collection workers controlled by Augur, each of which is responsible for querying some subset of these data sources. We are also hard at work building workers for new data sources. If you have an idea for a new one, [please tell us](https://github.com/chaoss/augur/issues/new?template=feature_request.md) - we'd love your input!


## Getting Started

If you're interested in collecting data with our tool, the Augur team has worked hard to develop a detailed guide to get started with our project which can be found [in our documentation](https://oss-augur.readthedocs.io/en/main/getting-started/toc.html).

If you're looking to contribute to Augur's code, you can find installation instructions, development guides, architecture references (coming soon), best practices and more in our [developer documentation](https://oss-augur.readthedocs.io/en/main/development-guide/toc.html). Please know that while it's still rather sparse right now,
we are actively adding to it all the time. If you get stuck, please feel free to [ask for help](https://github.com/chaoss/augur/issues/new)!

## Contributing

To contribute to Augur, please follow the guidelines found in our [CONTRIBUTING.md](CONTRIBUTING.md) and our [Code of Conduct](CODE_OF_CONDUCT.md). Augur is a welcoming community that is open to all, regardless if you're working on your 1000th contribution to open source or your 1st. We strongly believe that much of what makes open source so great is the incredible communities it brings together, so we invite you to join us!

## License, Copyright, and Funding

Copyright © 2112 University of Nebraska at Omaha, University of Missouri and the CHAOSS Project.

Augur is free software: you can redistribute it and/or modify it under the terms of the MIT License as published by the Open Source Initiative. See the [LICENSE](LICENSE) file for more details.

This work has been funded through the Alfred P. Sloan Foundation, Mozilla, The Reynolds Journalism Institute, contributions from VMWare, Red Hat Software, Grace Hopper's Open Source Day, GitHub, Microsoft, Twitter, Adobe, the Gluster Project, Open Source Summit (NA/Europe), and the Linux Foundation Compliance Summit. Significant design contributors include Kate Stewart, Dawn Foster, Duane O'Brien, Remy Decausemaker, others omitted due to the memory limitations of project maintainers, and 15 Google Summer of Code Students.
### How chenkx got Augur running

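> A side note:
> the upstream repo has released v0.40, but the default main branch is still the old version.
> The new version should be on the `augur-new`-related branches.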

Set the environment variables:
```shell
export AUGUR_DB_PORT=5434
export AUGUR_GITHUB_USERNAME=<your github username>
export AUGUR_GITHUB_API_KEY=<your github token>
export AUGUR_GITLAB_USERNAME=<your gitlab username>
export AUGUR_GITLAB_API_KEY=<your gitlab token>
```

#### Using the Dockerized Postgres database
```shell
docker compose up
```
#### Using a local Postgres database

##### Set up and configure the local Postgres database
In `postgresql.conf`, set:
```conf
listen_addresses = '*'
```
To connect to the database from Docker, also append the following to `pg_hba.conf`:
```conf
# TYPE DATABASE USER ADDRESS METHOD
host all all 0.0.0.0/0 md5
```

##### Create the augur database and the augur user, and configure privileges
```sql
CREATE DATABASE augur;
CREATE USER augur WITH ENCRYPTED PASSWORD 'augur';
-- GRANT ALL PRIVILEGES ON DATABASE augur TO augur;
ALTER DATABASE augur OWNER TO augur;
```
##### Set one more environment variable

In WSL, the IP address is the one that follows `inet` in the output of `ip addr | grep eth0`.
```shell
# if in wsl:
export AUGUR_DB=postgresql+psycopg2://augur:augur@<your wsl ip address>:5432/augur
# if in windows:
export AUGUR_DB=postgresql+psycopg2://augur:augur@localhost:5432/augur
```
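
The `AUGUR_DB` value above follows the usual SQLAlchemy URL shape, `dialect+driver://user:password@host:port/database`. As a minimal sketch, the string could be assembled in Python so that special characters in the credentials are percent-encoded. The helper name `build_augur_db_url` is hypothetical and not part of Augur; the README above simply exports the finished string.

```python
from urllib.parse import quote_plus


def build_augur_db_url(user: str, password: str, host: str,
                       port: int = 5432, database: str = "augur") -> str:
    """Assemble a postgresql+psycopg2 URL shaped like the AUGUR_DB value.

    Hypothetical helper for illustration only.
    """
    # Percent-encode the credentials so characters such as '@' or ':'
    # cannot be mistaken for URL delimiters.
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )
```

For example, `build_augur_db_url("augur", "augur", "localhost")` reproduces the Windows variant shown above.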
##### Start Augur

```shell
# use the Postgres instance running in WSL or on Windows
docker compose -f docker-compose-externalDB.yml up -d
```
61 changes: 61 additions & 0 deletions README_old.md
@@ -0,0 +1,61 @@
# Augur NEW Release v0.42.0

[![standard-readme compliant](https://img.shields.io/badge/standard--readme-OK-green.svg?style=flat-square)](https://github.com/RichardLitt/standard-readme)

[![Build Docker images](https://github.com/chaoss/augur/actions/workflows/build_docker.yml/badge.svg)](https://github.com/chaoss/augur/actions/workflows/build_docker.yml)

[![Hits-of-Code](https://hitsofcode.com/github/chaoss/augur?branch=main)](https://hitsofcode.com/github/chaoss/augur/view?branch=main)


[![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/2788/badge)](https://bestpractices.coreinfrastructure.org/projects/2788)

## What is Augur?

Augur is a software suite for collecting and measuring structured data
about [free](https://www.fsf.org/about/) and [open-source](https://opensource.org/docs/osd) software (FOSS) communities.

We gather trace data for a group of repositories, normalize it into our data model, and provide a variety of metrics about said data. The structure of our data model enables us to synthesize data across various platforms to provide meaningful context for meaningful questions about the way these communities evolve.
Augur’s main focus is to measure the overall health and sustainability of open source projects, as these types of projects are system critical for nearly every software organization or company. We do this by gathering data about project repositories and normalizing that into our data model to provide useful metrics about your project’s health. For example, one of our metrics is Burstiness. Burstiness – how are short timeframes of intense activity, followed by a corresponding return to a typical pattern of activity, observed in a project?

This can paint a picture of a project’s focus and gain insight into the potential stability of a project and how its typical cycle of updates occurs.

We are a [CHAOSS](https://chaoss.community) project, and many of our
metrics are implementations of the metrics defined by our awesome community. You can find a full list of them [here](https://chaoss.community/metrics/).

For more information on how to get involved, see [the CHAOSS website](https://chaoss.community/participate/).

## Collecting Data

Augur supports Python3.6 through Python3.9 on all platforms. Python3.10 and above do not yet work because of machine learning worker dependencies. On OSX, you can create a Python 3.9 environment this way: `python3.9 -m venv path/to/venv`.

Augur's main focus is to measure the overall health and sustainability of open source projects.

Augur collects more data about open source software projects than any other available software. Augur's main focus is to measure the overall health and sustainability of open source projects.
One of Augur's core tenets is a desire to openly gather data that people can trust, and then provide useful and well-defined metrics that help give important context to the larger stories being told by that data. We do this in a variety of ways, one of which is doing all our own data collection in house. We currently collect data from a few main sources:

1. Raw Git commit logs (commits, contributors)
2. GitHub's API (issues, pull requests, contributors, releases, repository metadata)
3. The Linux Foundation's [Core Infrastructure Initiative](https://www.coreinfrastructure.org/) API (repository metadata)
4. [Succinct Code Counter](https://github.com/boyter/scc), a blazingly fast Sloc, Cloc, and Code tool that also performs COCOMO calculations

This data is collected by dedicated data collection workers controlled by Augur, each of which is responsible for querying some subset of these data sources. We are also hard at work building workers for new data sources. If you have an idea for a new one, [please tell us](https://github.com/chaoss/augur/issues/new?template=feature_request.md) - we'd love your input!


## Getting Started

If you're interested in collecting data with our tool, the Augur team has worked hard to develop a detailed guide to get started with our project which can be found [in our documentation](https://oss-augur.readthedocs.io/en/main/getting-started/toc.html).

If you're looking to contribute to Augur's code, you can find installation instructions, development guides, architecture references (coming soon), best practices and more in our [developer documentation](https://oss-augur.readthedocs.io/en/main/development-guide/toc.html). Please know that while it's still rather sparse right now,
we are actively adding to it all the time. If you get stuck, please feel free to [ask for help](https://github.com/chaoss/augur/issues/new)!

## Contributing

To contribute to Augur, please follow the guidelines found in our [CONTRIBUTING.md](CONTRIBUTING.md) and our [Code of Conduct](CODE_OF_CONDUCT.md). Augur is a welcoming community that is open to all, regardless if you're working on your 1000th contribution to open source or your 1st. We strongly believe that much of what makes open source so great is the incredible communities it brings together, so we invite you to join us!

## License, Copyright, and Funding

Copyright © 2112 University of Nebraska at Omaha, University of Missouri and the CHAOSS Project.

Augur is free software: you can redistribute it and/or modify it under the terms of the MIT License as published by the Open Source Initiative. See the [LICENSE](LICENSE) file for more details.

This work has been funded through the Alfred P. Sloan Foundation, Mozilla, The Reynolds Journalism Institute, contributions from VMWare, Red Hat Software, Grace Hopper's Open Source Day, GitHub, Microsoft, Twitter, Adobe, the Gluster Project, Open Source Summit (NA/Europe), and the Linux Foundation Compliance Summit. Significant design contributors include Kate Stewart, Dawn Foster, Duane O'Brien, Remy Decausemaker, others omitted due to the memory limitations of project maintainers, and 15 Google Summer of Code Students.
15 changes: 10 additions & 5 deletions augur/api/server.py
@@ -50,8 +50,15 @@ def __init__(self):
self.config = self.session.config
self.engine = self.session.engine

-self.cache_manager = self.create_cache_manager()
-self.server_cache = self.get_server_cache()
+# Development mode: skip the cache for now,
+# because with the cache enabled, saving a file triggers a warning.
+no_use_cache = True
+if no_use_cache:
+    self.cache_manager = None
+    self.server_cache = None
+else:
+    self.cache_manager = self.create_cache_manager()
+    self.server_cache = self.get_server_cache()
self.app = None
self.show_metadata = False
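
The hard-coded `no_use_cache = True` above disables the cache on every run, not only during development. One possible alternative, sketched here under the assumption that the `AUGUR_DEV` environment variable (set to `"1"` by the `--development` option in `backend.py`) is visible to the server process, is to derive the flag from the environment. The function name `cache_disabled` is hypothetical.

```python
import os
from typing import Mapping, Optional


def cache_disabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the server cache should be skipped.

    Hypothetical helper: treats AUGUR_DEV == "1" as development mode.
    """
    # Allow a mapping to be injected so the check can be exercised
    # without touching the real process environment.
    env = os.environ if environ is None else environ
    return env.get("AUGUR_DEV") == "1"
```

With such a helper, `no_use_cache = cache_disabled()` would keep the cache enabled outside development mode.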

@@ -72,7 +79,6 @@ def create_app(self):

self.app.config['WTF_CSRF_ENABLED'] = False


self.logger.debug("Creating API routes...")
self.create_all_routes()
self.create_metrics()
@@ -224,7 +230,7 @@ def get_metric_files(self) -> List[str]:
metric_files.append(file_id)

return metric_files

# NOTE: Paramater on=None removed, since it is not used in the function Aug 18, 2022 - Andrew Brain
def transform(self, func: Any, args: Any=None, kwargs: dict=None, repo_url_base: str=None, orient: str ='records',
group_by: str=None, aggregate: str='sum', resample=None, date_col: str='date') -> str:
@@ -362,7 +368,6 @@ def endpoint_function(*args, **kwargs) -> Response:
endpoint_function.__name__ = f"{endpoint_type}_" + func.__name__
return endpoint_function


def add_standard_metric(self, function: Any, endpoint: str) -> None:
"""Add standard metric routes to the flask app.

13 changes: 9 additions & 4 deletions augur/application/cli/backend.py
@@ -35,9 +35,10 @@ def cli():
@cli.command("start")
@click.option("--disable-collection", is_flag=True, default=False, help="Turns off data collection workers")
@click.option("--development", is_flag=True, default=False, help="Enable development mode, implies --disable-collection")
+@click.option("--reload", is_flag=True, default=True, help="Enable gunicorn reload mode!")
@test_connection
@test_db_connection
-def start(disable_collection, development):
+def start(disable_collection, development, reload):
"""Start Augur's backend server."""

try:
@@ -55,14 +56,18 @@ def start(disable_collection, development):
os.environ["AUGUR_DEV"] = "1"
logger.info("Starting in development mode")


with DatabaseSession(logger) as session:

gunicorn_location = os.getcwd() + "/augur/api/gunicorn_conf.py"
host = session.config.get_value("Server", "host")
port = session.config.get_value("Server", "port")

-gunicorn_command = f"gunicorn -c {gunicorn_location} -b {host}:{port} --preload augur.api.server:app"
+if not reload:
+    gunicorn_command = f"gunicorn -c {gunicorn_location} -b {host}:{port} --preload augur.api.server:app"
+else:
+    gunicorn_command = f"gunicorn -c {gunicorn_location} -b {host}:{port} --reload augur.api.server:app"

server = subprocess.Popen(gunicorn_command.split(" "))

time.sleep(3)
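
Splitting the formatted command on single spaces, as above, breaks if the config path contains a space. A minimal sketch of the same branch that builds the argument list directly follows; `build_gunicorn_command` is a hypothetical name, and the `-c`, `-b`, `--preload` and `--reload` flags are the ones already used in this diff.

```python
from typing import List


def build_gunicorn_command(conf_path: str, host: str, port: int,
                           reload: bool) -> List[str]:
    """Build the gunicorn argv as a list (hypothetical helper)."""
    # The diff passes either --reload (development) or --preload,
    # never both, so a single flag is selected here.
    mode_flag = "--reload" if reload else "--preload"
    return [
        "gunicorn",
        "-c", conf_path,
        "-b", f"{host}:{port}",
        mode_flag,
        "augur.api.server:app",
    ]
```

The returned list can be handed to `subprocess.Popen` without calling `.split(" ")`. Separately, note that with `is_flag=True, default=True` passing `--reload` cannot switch the option off; click's paired form `--reload/--no-reload` is the usual way to expose both states.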
2 changes: 1 addition & 1 deletion augur/tasks/data_analysis/message_insights/setup.py
@@ -42,7 +42,7 @@ def read(filename):
'h5py~=3.6.0',
'scikit-image==0.19.1',
'joblib==1.0.1',
-'xgboost',
+'xgboost==1.4.2',
'bs4==0.0.1',
'xlrd==2.0.1',
'gensim==4.2.0'
2 changes: 1 addition & 1 deletion augur/tasks/init/celery_app.py
@@ -35,7 +35,7 @@
'augur.tasks.data_analysis.discourse_analysis.tasks',
'augur.tasks.data_analysis.pull_request_analysis_worker.tasks']

-tasks = start_tasks + github_tasks + git_tasks + data_analysis_tasks
+tasks = start_tasks + github_tasks + git_tasks # + data_analysis_tasks

redis_db_number, redis_conn_string = get_redis_conn_values()

1 change: 0 additions & 1 deletion augur/tasks/start_tasks.py
@@ -11,7 +11,6 @@


from augur.tasks.github import *
-from augur.tasks.data_analysis import *
from augur.tasks.github.detect_move.tasks import detect_github_repo_move
from augur.tasks.github.releases.tasks import collect_releases
from augur.tasks.github.repo_info.tasks import collect_repo_info
Empty file added augur/util/__init__.py
3 changes: 3 additions & 0 deletions docker-compose-externalDB.yml
@@ -14,6 +14,9 @@ services:
dockerfile: ./docker/backend/Dockerfile
volumes:
- facade:/augur/facade
+- ./repo_groups.csv:/repo_groups.csv
+- ./repos.csv:/repos.csv
+- ./augur:/augur/augur # for development.
restart: unless-stopped
ports:
- 5002:5000
12 changes: 12 additions & 0 deletions docker-compose.yml
@@ -24,6 +24,9 @@ services:
dockerfile: ./docker/backend/Dockerfile
volumes:
- facade:/augur/facade
+- ./repo_groups.csv:/repo_groups.csv
+- ./repos.csv:/repos.csv
+- ./augur:/augur/augur # for development.
restart: unless-stopped
ports:
- 5002:5000
@@ -41,6 +44,15 @@ services:
- augur-db
- redis

+  augurface:
+    image: augurlabs/augurface:latest
+    build:
+      context: .
+      dockerfile: ./docker/augurface/Dockerfile
+    ports:
+      - "127.0.0.1:8088:8080"
+    depends_on:
+      - augur
volumes:
facade:
driver: local
8 changes: 6 additions & 2 deletions docker/backend/Dockerfile
@@ -29,12 +29,16 @@ COPY ./scripts/ scripts/

#COPY ./docker/backend/docker.config.json .
RUN python3 -m venv /opt/venv
+ENV VIRTUAL_ENV=/opt/venv
+ENV PATH="$VIRTUAL_ENV/bin:$PATH"

-RUN set -x \
-    && /opt/venv/bin/pip install .
+# add -e for editable.
+RUN set -ex \
+    && pip install -e .

RUN ./scripts/docker/install-go.sh
# RUN ./scripts/install/workers.sh
+# RUN ./scripts/docker/nltk_fix.sh

RUN mkdir -p repos/ logs/ facade/

2 changes: 0 additions & 2 deletions docker/backend/entrypoint.sh
@@ -32,9 +32,7 @@ if [[ "$REDIS_CONN_STRING" == *"localhost"* ]] || [[ "$REDIS_CONN_STRING" == *"1
else
export redis_conn_string=$REDIS_CONN_STRING
fi

./scripts/install/config.sh $target

if [[ -f /repo_groups.csv ]]; then
augur db add-repo-groups /repo_groups.csv
fi
2 changes: 2 additions & 0 deletions repo_groups.csv
@@ -0,0 +1,2 @@
20,chenkx
21,ckx-new
4 changes: 4 additions & 0 deletions repos.csv
@@ -0,0 +1,4 @@
https://github.com/ckxkexing/augur.git,20
https://github.com/ckxkexing/perceval.git,20
https://github.com/ckxkexing/airflow-jobs.git,20
https://github.com/ppy/osu.git,21
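
Because `entrypoint.sh` loads `/repo_groups.csv` when it exists, and the compose files mount `repos.csv` alongside it, a quick consistency check before starting the containers can catch a repository that points at a missing group. The sketch below assumes the columns are `group id,group name` and `repository URL,group id`, which is how the rows above read; all function names are hypothetical.

```python
import csv
import io
from typing import Dict, List, Tuple


def load_repo_groups(text: str) -> Dict[int, str]:
    """Parse `id,name` rows shaped like those in repo_groups.csv."""
    return {int(row[0]): row[1] for row in csv.reader(io.StringIO(text)) if row}


def load_repos(text: str) -> List[Tuple[str, int]]:
    """Parse `url,group_id` rows shaped like those in repos.csv."""
    return [(row[0], int(row[1])) for row in csv.reader(io.StringIO(text)) if row]


def repos_with_unknown_group(groups: Dict[int, str],
                             repos: List[Tuple[str, int]]) -> List[str]:
    """Return the URLs whose group id is not defined in the groups file."""
    return [url for url, group_id in repos if group_id not in groups]
```

Reading both files with `open(...).read()` and passing the text to these helpers should return an empty list for the two CSV files added in this change.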