Provide basic downloads statistics to OBS via a REST API. To do that this script downloads the apache log files, convert it into a format fast to scan (using a BerkeleyDB container), analyze the result and store the data in a Redis store.
The information is always available via a REST API implemented with the microframework Flask, that query the data in the Redis store.
Install dependencies:
pip install -r requirements.txtConfigure the URL where are stored the OBS logs files:
vi obstat.cfgCreate the BerkeleyDB containers:
./log2db.sh 2014-01-01Store the information in Redis:
./analyze.sh 2014-01-01This last scripts takes care of not duplicating the information, analyzing twice the same file, so for example, a new call to the analyze script will skip the files already in the database.
Setup the REST API:
python api.pyWe publish two different metrics, the number of simple hits and the number of visits (different IPs) in an object. Each metric can be queried for five different objects: packages, architecture, repository, project and time.
The URL pattern for hist is:
/hits/<unit>/<date>/[[[[<project>]/<repository>]/<arch>]/<package>]
and for visits is:
/visits/<unit>/<date>/[[[[<project>]/<repository>]/<arch>]/<package>]
The <unit> slot is can be 'day', 'week' or 'month'. Depending of
this slot, <date> can have a different format:
| unit | date format | example |
|---|---|---|
| day | YYYYMMDD | 20140215 -- Single day 2014/02/15 |
| week | YYYYWW | 201401 -- First week of 2014 (2013/12/30 to 2014/02/5) |
| month | YYYYMM | 201401 -- First month of 2014 (2014/01/01 to 2014/01/31) |
Examples of valid requests:
/hits/day/20140215/home:tiwai:pa-fix-13.1/standard/x86_64/pulseaudio-module-lirc-4.0.git.270.g9490a-6.1.x86_64.rpm
/hits/day/20140215/home:tiwai:pa-fix-13.1/standard/x86_64
/visits/week/201403/home:oprudkyi
/visits/month/201402/openSUSE:13.1/standard/x86_64