Generate a database
Anubis helps you generate a database. Use the CLI to get started:
> anubis timemachine --new
This creates an input file, ./anubis_time_machine.yml, in your current folder.
Edit this input file to match your needs:
# Anubis time machine input file
git_path : /Users/dauptain/GITLAB/avbp
branch : dev
rel_source_path : ./SOURCES
year_start : 2017
year_end : 2022
out_dir : ANUBIS_AVBP
lizard_switch : True
git_path is the local path to your git repository.
branch is the name of the branch you want to scan. It is often `master`.
rel_source_path is a subfolder of your repository, used to limit the cloc and lizard analyses. It usually points to the core of the source code (avoiding tests, documentation, templates, etc.).
year_start and year_end define the time frame. These are integers.
out_dir is the folder where you want to save your Anubis database.
lizard_switch enables or disables the lizard analysis, which can be quite slow.
Then you can start the database generation with:
> anubis timemachine --file ./anubis_time_machine.yml
The execution prints output such as:
(...)
ETA : 0:00:01.121897 sec
timewarping (55/72)... date is now 2021-7-01 12:00
... find last revision
... checkout revision eccfd209bbf9eb90f91645ff8ace621b5441f41e
... running cloc
... gather commit stats
... get git blame info
... branch status
... running lizard
ETA : 0:01:09.197262 sec
timewarping (56/72)... date is now 2021-8-01 12:00
... find last revision
... checkout revision 89534c42e102fd5db217cf290b564313187281d4
... running cloc
... gather commit stats
... get git blame info
... branch status
... running lizard
ETA : 0:18:13.801281 sec
timewarping (57/72)... date is now 2021-9-01 12:00
... find last revision
... checkout revision 02da5a82c8f765982e05153d477e727552a5ce30
... running cloc
... gather commit stats
... get git blame info
... branch status
... running lizard
(...)
This output tells you which date the git repository is currently set to (here Sept. 2021) and gives a rough estimate of the remaining time, or ETA (here 18 minutes).
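The counter in the log (57/72) is consistent with one snapshot per month over the inclusive range year_start to year_end. The sketch below reproduces that schedule; it is inferred from the log shown above, not taken from the Anubis source, and the function name monthly_snapshots is hypothetical.

```python
from datetime import datetime


def monthly_snapshots(year_start: int, year_end: int) -> list[datetime]:
    """First-of-month dates at 12:00, from January of year_start to December of year_end.

    Assumption: the time machine takes one snapshot per month, both years included.
    """
    return [
        datetime(year, month, 1, 12, 0)
        for year in range(year_start, year_end + 1)
        for month in range(1, 13)
    ]


dates = monthly_snapshots(2017, 2022)
print(len(dates))  # 72, the total shown in "timewarping (57/72)"
print(dates[56])   # 2021-09-01 12:00:00, the 57th snapshot
```

With year_start 2017 and year_end 2022, this gives 72 dates, and the 55th to 57th match the July, August and September 2021 dates in the log.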
Database content
The database looks like:
ANUBIS_Nek5000
├── anubis_2017-01
│ ├── blame.json
│ ├── branch_status.json
│ ├── cloc.json
│ ├── commits.json
│ └── lizard.csv
├── anubis_2017-02
│ ├── cloc.json
│ ├── blame.json
│ ├── branch_status.json
│ ├── commits.json
│ └── lizard.csv
├── anubis_2017-03
│ ├── cloc.json
│ ├── blame.json
│ ├── branch_status.json
│ ├── commits.json
│ └── lizard.csv
(...)
Each month gets its own subfolder, which contains several data files.
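To check a generated database quickly, the layout above can be walked with the standard library. This is a minimal sketch, assuming the folder and file names shown in the tree; list_snapshots and EXPECTED_FILES are hypothetical names, not part of Anubis.

```python
from pathlib import Path

# File names taken from the tree shown above.
EXPECTED_FILES = (
    "blame.json",
    "branch_status.json",
    "cloc.json",
    "commits.json",
    "lizard.csv",
)


def list_snapshots(db_dir):
    """Map each anubis_YYYY-MM subfolder name to the expected files it lacks."""
    snapshots = {}
    # Zero-padded months make the alphabetical order chronological.
    for folder in sorted(Path(db_dir).glob("anubis_*")):
        if folder.is_dir():
            missing = [name for name in EXPECTED_FILES if not (folder / name).exists()]
            snapshots[folder.name] = missing
    return snapshots


# Example call, with the out_dir used in the input file above:
# list_snapshots("ANUBIS_AVBP")
```

An empty list for a month means all five files are present. A missing lizard.csv may simply reflect a run with the lizard analysis disabled.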
Blame
The blame.json gives the following information for each non-blank line of code of each file:
the author
the date of last modification
the indentation level (i.e. the number of blanks before the first non-blank character)
the line number
[
{
"file": "src/hello_world.py",
"author": [
"Aurélien",
"Aurélien",
"Aurélien",
"Aurélien"
],
"date": [
"2021-05-11",
"2021-05-11",
"2021-05-11",
"2021-05-11"
],
"indentation": [
0,
4,
0,
4
],
"line_number": [
1,
2,
3,
4
]
}
]
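The four lists of a blame entry are parallel: position i in each list describes the same line of code. A minimal sketch of how this structure can be read, using an inline sample that mirrors the example above (the helper names are hypothetical):

```python
import json
from collections import Counter


def blame_rows(entry):
    """Turn the parallel lists of one file entry into per-line tuples."""
    return list(
        zip(entry["line_number"], entry["author"], entry["date"], entry["indentation"])
    )


def lines_per_author(blame):
    """Count non-blank lines attributed to each author across all files."""
    counts = Counter()
    for entry in blame:
        counts.update(entry["author"])
    return counts


# Inline sample with the same structure as blame.json; for a real database,
# use json.load() on the blame.json file of a monthly subfolder instead.
sample = json.loads("""
[{"file": "src/hello_world.py",
  "author": ["Aurélien", "Aurélien", "Aurélien", "Aurélien"],
  "date": ["2021-05-11", "2021-05-11", "2021-05-11", "2021-05-11"],
  "indentation": [0, 4, 0, 4],
  "line_number": [1, 2, 3, 4]}]
""")

print(blame_rows(sample[0])[1])  # (2, 'Aurélien', '2021-05-11', 4)
print(lines_per_author(sample))
```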
Branch status
The branch_status.json gives, for each branch, the number of commits behind and ahead of the reference branch, i.e. the branch specified in the timemachine input .yml file.
It also stores the number of commits on the reference branch.
[
{
"branch": "remotes/origin/RELEASE/7.0",
"behind": 906,
"ahead": 343,
"nb_commits_ref_branch": 5219
},
{
"branch": "remotes/origin/RELEASE/7.0.1",
"behind": 577,
"ahead": 435,
"nb_commits_ref_branch": 5219
},
{
"branch": "remotes/origin/RELEASE/7.0.1-SEP16",
"behind": 577,
"ahead": 393,
"nb_commits_ref_branch": 5219
},
{
"branch": "remotes/origin/WIP/EM2C-BiPeriodic_Channel_TBLE",
"behind": 577,
"ahead": 340,
"nb_commits_ref_branch": 5219
},
{
"branch": "remotes/origin/WIP/volumic_temporals",
"behind": 577,
"ahead": 321,
"nb_commits_ref_branch": 5219
}
]
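As an illustration of how these records can be used, the sketch below ranks branches by how far they lag behind the reference branch. The sample reuses two records from the example above; most_outdated is a hypothetical helper, not an Anubis function.

```python
def most_outdated(branches, top=3):
    """Return the `top` branches with the largest number of commits behind."""
    return sorted(branches, key=lambda branch: branch["behind"], reverse=True)[:top]


# Two records copied from the branch_status.json example above.
sample = [
    {"branch": "remotes/origin/RELEASE/7.0.1", "behind": 577, "ahead": 435,
     "nb_commits_ref_branch": 5219},
    {"branch": "remotes/origin/RELEASE/7.0", "behind": 906, "ahead": 343,
     "nb_commits_ref_branch": 5219},
]

for branch in most_outdated(sample):
    print(branch["branch"], branch["behind"], branch["ahead"])
```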
CLOC, or Count Lines of Code
The cloc.json gives the code sizes:
{
"header": {
"cloc_url": "github.com/AlDanial/cloc",
"cloc_version": "1.92",
"elapsed_seconds": 2.983234167099,
"n_files": 1195,
"n_lines": 519864,
"files_per_second": 400.571974261766,
"lines_per_second": 174261.881864116
},
"Fortran 90": {
"nFiles": 1166,
"blank": 83203,
"comment": 82278,
"code": 347867
},
"C": {
"nFiles": 16,
"blank": 1174,
"comment": 722,
"code": 2806
},
"Python": {
"nFiles": 1,
"blank": 163,
"comment": 67,
"code": 496
},
"make": {
"nFiles": 1,
"blank": 49,
"comment": 71,
"code": 266
},
"C/C++ Header": {
"nFiles": 9,
"blank": 152,
"comment": 155,
"code": 167
},
"Bourne Shell": {
"nFiles": 1,
"blank": 14,
"comment": 18,
"code": 162
},
"CMake": {
"nFiles": 1,
"blank": 7,
"comment": 7,
"code": 20
},
"SUM": {
"blank": 84762,
"comment": 83318,
"code": 351784,
"nFiles": 1195
}
}
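One possible use of these counts is a comment ratio per language. The sketch below assumes the structure shown above, where the "header" and "SUM" keys are not languages and must be skipped; comment_ratio is a hypothetical helper.

```python
def comment_ratio(cloc):
    """Fraction of comment lines among comment + code lines, per language."""
    ratios = {}
    for language, stats in cloc.items():
        if language in ("header", "SUM"):
            continue  # metadata and totals, not languages
        total = stats["comment"] + stats["code"]
        ratios[language] = stats["comment"] / total if total else 0.0
    return ratios


# Subset of the cloc.json example above.
sample = {
    "header": {"cloc_version": "1.92", "n_files": 1195},
    "Fortran 90": {"nFiles": 1166, "blank": 83203, "comment": 82278, "code": 347867},
    "C": {"nFiles": 16, "blank": 1174, "comment": 722, "code": 2806},
    "SUM": {"blank": 84762, "comment": 83318, "code": 351784, "nFiles": 1195},
}

for language, ratio in comment_ratio(sample).items():
    print(f"{language}: {ratio:.1%}")
```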
Commits
The commits.json gives stats about commits:
[
{
"author": "Aurelien PERROT <perrot@cerfacs.fr>",
"date": "Sat Jan 29 21:59:55 2022 +0100",
"files": 1,
"insertions": 1,
"deletions": 0,
"revision": "872de25e0372d907bbe366e09a9049c05aa609c2"
},
{
"author": "Victor Xing <xing@cerfacs.fr>",
"date": "Tue Jan 11 16:37:28 2022 +0100",
"files": 1,
"insertions": 3,
"deletions": 0,
"revision": "1676e6231f444e6ba924bed689312e019d466f92"
},
{
"author": "Gabriel Staffelbach <gabriel.staffelbach@cerfacs.fr>",
"date": "Tue Jan 25 14:21:35 2022 +0100",
"files": 2,
"insertions": 0,
"deletions": 9,
"revision": "0cbe1e4464204f48bbef912428396a00089b6a82"
}
]
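The date strings follow git's default date format, which Python can parse with strptime. The sketch below parses them and sums insertions and deletions per author. The format string is an assumption based on the sample above, and it relies on English day and month abbreviations (the default C locale).

```python
from collections import defaultdict
from datetime import datetime

# Assumed format, e.g. "Sat Jan 29 21:59:55 2022 +0100".
GIT_DATE_FORMAT = "%a %b %d %H:%M:%S %Y %z"


def commit_datetime(commit):
    """Parse the date string of one commit record into an aware datetime."""
    return datetime.strptime(commit["date"], GIT_DATE_FORMAT)


def churn_per_author(commits):
    """Sum insertions + deletions for each author."""
    churn = defaultdict(int)
    for commit in commits:
        churn[commit["author"]] += commit["insertions"] + commit["deletions"]
    return dict(churn)


# Two records copied from the commits.json example above.
sample = [
    {"author": "Victor Xing <xing@cerfacs.fr>",
     "date": "Tue Jan 11 16:37:28 2022 +0100",
     "files": 1, "insertions": 3, "deletions": 0,
     "revision": "1676e6231f444e6ba924bed689312e019d466f92"},
    {"author": "Gabriel Staffelbach <gabriel.staffelbach@cerfacs.fr>",
     "date": "Tue Jan 25 14:21:35 2022 +0100",
     "files": 2, "insertions": 0, "deletions": 9,
     "revision": "0cbe1e4464204f48bbef912428396a00089b6a82"},
]

print(commit_datetime(sample[0]).isoformat())
print(churn_per_author(sample))
```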
Complexity, with lizard
The code complexity is computed with Lizard, for each function-like unit, i.e.:
Function
Subroutine
Method
etc.
The file itself looks like:
NLOC,CCN,token,param,size,function@line@file,file,function,call,start,end
54,5,610,2,110,"add_vortex@36-145@./SOURCES/TOOLS/INIT/add_vortex.f90","./SOURCES/TOOLS/INIT/add_vortex.f90","add_vortex","add_vortex( grid , vortex )",36,145
183,33,1629,2,324,"gas_out_main@28-351@./SOURCES/TOOLS/INIT/gas_out_main.f90","./SOURCES/TOOLS/INIT/gas_out_main.f90","gas_out_main","gas_out_main( grid , gas_out )",28,351
(...)
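Some explanations are required:
NLOC is the number of lines of code in the function.
CCN is the cyclomatic complexity number, as defined by McCabe.
token is the number of tokens (roughly, "words") in the function.
param is the number of parameters detected.
size is the number of lines spanned by the function, i.e. end - start + 1 (110 for add_vortex above).
function@line@file is the full location of the function: name, line range and file.
file is the file containing the function.
function is the name of the function.
call is the function name with its argument list, e.g. add_vortex( grid , vortex ).
start is the first line of the function in the file.
end is the last line of the function in the file.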
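Since lizard.csv has a header row, it can be read with csv.DictReader. The sketch below lists the functions whose CCN exceeds a threshold; the value 10 is only an illustrative choice, and complex_functions is a hypothetical helper. The sample reuses the two rows shown above.

```python
import csv
import io


def complex_functions(csv_text, ccn_threshold=10):
    """Return (function, CCN) pairs whose cyclomatic complexity exceeds the threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["function"], int(row["CCN"]))
        for row in reader
        if int(row["CCN"]) > ccn_threshold
    ]


# Header and rows copied from the lizard.csv example above; for a real
# database, read the text of the lizard.csv file of a monthly subfolder instead.
sample = (
    "NLOC,CCN,token,param,size,function@line@file,file,function,call,start,end\n"
    '54,5,610,2,110,"add_vortex@36-145@./SOURCES/TOOLS/INIT/add_vortex.f90",'
    '"./SOURCES/TOOLS/INIT/add_vortex.f90","add_vortex",'
    '"add_vortex( grid , vortex )",36,145\n'
    '183,33,1629,2,324,"gas_out_main@28-351@./SOURCES/TOOLS/INIT/gas_out_main.f90",'
    '"./SOURCES/TOOLS/INIT/gas_out_main.f90","gas_out_main",'
    '"gas_out_main( grid , gas_out )",28,351\n'
)

print(complex_functions(sample))  # [('gas_out_main', 33)]
```

The quoted fields matter here: the call column contains commas, which the csv module handles but a naive split on "," would not.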