
persefone-model code review, May 2024

The code is well-structured into understandable units, both in terms of directories and files and in terms of functions of reasonable size.

Every function has a docstring that is clearly written. Functions and variable names are easy to understand.

The documentation is clearly written and well-structured and the website is nicely illustrated.

The actual simulation code is behind two different macro systems: the Agents.jl macro system as well as Persefone's built-in domain-specific language (DSL). This makes the overall system hard to understand, modify, and validate. If possible, a simpler design without macros should be considered.

General comments

  • use of eval inside simulation code: this seems to me both an unnecessary performance cost and a complication for binary packaging, since a binary distribution of Persefone would then always have to ship the full Julia compiler to support runtime eval.

  • The domain-specific language (DSL) for defining species is complex, with nested macro calls and large amounts of generated code. I found that it takes quite a bit longer to understand what actually happens under the hood. Using @macroexpand one can see the generated code in the REPL; for the skylark model it is quite large and interspersed with comments from macro expansion.

    • If possible, I would suggest using normal types and functions instead of a macro-based DSL.
  • Some macros could be converted into functions, which would make the code easier to understand. The macro form is currently necessary for use inside the DSL, but the macro form also gets used outside of the DSL. Examples:

    • @param
    • @rand
    • @shuffle!
    • @chance
  • Hardcoded magic constants appear in some functions; these should be made global constants or perhaps fields of the model struct. Giving these constants names makes the code clearer, and also makes it easier for users of the package to refer to them if necessary. It might also be useful to give these constants physical units (see next point).

  • Use physical units (Unitful.jl) for clarity, documentation, and catching accidental unit mismatch errors, both in Persefone as well as for users of the package. Some examples:

    • LAItotal
    • LAIgreen
    • GDD
    • conversionfactor # hectares to pixels
  • Hardcoded pixel size of 10 m * 10 m = 100 m^2: this should be a global const with physical units or part of the model struct (see the sketch after this list).

  • Coding style

    • some overuse of && and the ternary operator cond ? a : b; some of these would probably read better as if statements
    • sometimes there are no spaces around operators, e.g. * for string concatenation, which makes the code harder to read
    • perhaps use explicit return statements at the end of functions for clarity, but this is a personal preference of mine that is not universal in the Julia ecosystem
  • The minimum Julia version should be 1.10 due to this bug opened by Daniel: https://github.com/carlobaldassi/ArgParse.jl/issues/121. The issue was recently closed with the recommendation to upgrade to a newer Julia version.

  • Docstrings for functions should state the meaning and type of the return value.

  • Possible workflows (Github) / pipelines (Gitlab) for automation

    • CI: run unit tests after each commit and for each candidate merge request
    • Docs: build docs automatically after each commit
    • CompatHelper: create pull request when dependencies have new versions
    • TagBot: tag releases to be incorporated in the Julia package registry (once Persefone is released there)
    • benchmark: automatically run benchmarks to catch performance regressions, see also PkgBenchmark.jl
  • Use PrecompileTools.jl for precompilation and also check for invalidations etc. to reduce package loading time.

  • Replace loops in the style of for i = 1:length(a) with for i in eachindex(a); this makes the loop independent of 1-based or 0-based indexing, etc.

  • make test / test/runtests.jl: no progress is visible, so it is hard to see what is currently happening. I use the following in my packages:

    # this function goes in `test/runtests.jl`
    using Test

    # show which testset is currently running
    showtestset() = println(" "^(2 * Test.get_testset_depth()), "testing ",
                            Test.get_testset().description)
    
    # call showtestset() in each @testset
    @testset verbose=true "testset_name" begin
        showtestset()
        # tests here ...
    end
  • It might make sense to use type-based dispatch (on the type of the agent) instead of having code of this form:

    for a in allagents(model)
        (typeof(a) != Animal) && continue
        pops[a.name] += 1
    end
    • A different alternative is to have an allanimals function.
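
A minimal sketch of the dispatch-based variant (countagent! is a hypothetical helper, and Animal stands for the agent type in question):

    # one method per agent type instead of an explicit type check
    countagent!(pops, a::Animal) = (pops[a.name] += 1)
    countagent!(pops, a) = nothing   # all other agent types are ignored

    for a in allagents(model)
        countagent!(pops, a)
    end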
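
And a sketch of the named-constants and physical-units suggestions above (all names are illustrative, not taken from the code base):

    using Unitful

    # named, unit-annotated constants instead of hardcoded magic numbers
    const PIXELSIZE = 10.0u"m"        # side length of one map pixel
    const PIXELAREA = PIXELSIZE^2     # 100.0 m^2
    const HECTARE   = 10_000.0u"m^2"  # one hectare

    # the conversion factor now carries its derivation, and mismatched
    # units raise an error instead of passing silently
    const PIXELSPERHECTARE = HECTARE / PIXELAREA   # 100.0 (dimensionless)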

General comments on testing

CI (Continuous integration), recommended

https://en.wikipedia.org/wiki/Continuous_integration

There does not seem to be a CI (continuous integration) runner set up that runs the unit tests after every commit to master, on a matrix of OS and Julia version combinations.

This should also be activated in merge requests, so that nothing is merged to the master branch that fails the unit tests.
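
A minimal sketch of a Gitlab pipeline job for this (the Docker image and single Julia version are assumptions; a full setup would use a matrix of versions):

    # .gitlab-ci.yml
    test:
      image: julia:1.10
      script:
        - julia --project=. -e 'using Pkg; Pkg.instantiate(); Pkg.test()'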

Julia packages on Github typically include a little badge at the top of the README indicating the CI status in the form of "Run tests: passing".

Aqua.jl tests, recommended

https://github.com/JuliaTesting/Aqua.jl

The package Aqua.jl runs some automatic quality assurance checks for common problems in Julia codebases, for example (excerpt from the link above):

  • There are no method ambiguities.
  • There are no undefined exports.
  • There are no unbound type parameters.
  • There are no stale dependencies listed in Project.toml.
  • etc.

Once the issues highlighted by the Aqua tests are fixed, the Aqua tests should be run as part of the unit tests.
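
For example, a minimal sketch in test/runtests.jl:

    using Aqua
    using Persefone

    # run all Aqua quality-assurance checks as part of the unit tests;
    # individual checks can be disabled via keyword arguments
    Aqua.test_all(Persefone)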

Julia packages on Github usually have a little badge "tested with Aqua.jl" at the top of the README.

Code coverage testing, recommended

Code coverage when running unit tests should be recorded, as code that is not exercised by unit tests or manual testing might fail due to even a simple typo. Many Julia packages on Github display a badge showing the code coverage of their unit tests as a percentage. This can be integrated into the workflows (Github) / pipelines (Gitlab) and recorded and updated automatically. When creating a merge request, the change in coverage can also be reported, alerting developers when they add a significant amount of code without corresponding tests.
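
A sketch of how coverage could be collected locally, assuming the Coverage.jl package:

    using Pkg
    Pkg.test("Persefone"; coverage = true)   # writes .cov files next to the sources

    using Coverage
    covered = process_folder("src")          # parse the generated .cov files
    lines_hit, lines_total = get_summary(covered)
    println("coverage: ", round(100 * lines_hit / lines_total; digits = 1), "%")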

Some relevant packages and external services:

  • codecov.io
  • coveralls.io

Optional: enforce consistent coding style

If a consistent coding style is desired, a code formatter can be run automatically to ensure any code committed to the repository follows the coding style guidelines chosen.

Coding style does not affect code correctness and is therefore less important. Adopting a formatter on an existing code base also usually causes one large reformatting commit, unless the coding style was enforced from the beginning.

Code formatting can be checked before merging a merge request, to ensure a consistent coding style throughout. README badges for the chosen coding style are also often used on Github Julia packages.

JuliaFormatter is the package usually used for code formatting: https://github.com/domluna/JuliaFormatter.jl
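
A minimal usage sketch:

    using JuliaFormatter

    # rewrite all Julia source files under src/ in place;
    # returns true if everything was already formatted correctly
    format("src"; verbose = true)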

Some popular coding styles in the Julia ecosystem, for example:

  • Blue style: https://github.com/invenia/BlueStyle
  • SciML style: https://github.com/SciML/SciMLStyle
  • YAS style: https://github.com/jrevels/YASGuide

Optional: JET.jl testing

https://github.com/aviatesk/JET.jl

JET.jl can be used to detect type instabilities, which can cause slow performance, as well as other bugs.

This package depends on the internals of the Julia compiler, and therefore results can change between Julia versions.
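
A minimal usage sketch (replace the analysed call with a real Persefone entry point):

    using JET

    # static analysis of the whole package for potential errors
    report_package("Persefone")

    # analyse one concrete call for type instabilities and other
    # optimisation problems
    @report_opt sum(rand(10))   # placeholder call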

persefone-model code

Project.toml

All packages listed under [deps] should have an entry in the [compat] section (this would also be flagged by an Aqua.jl test).

Version bounds on a compat entry of the form ">=5.6" can cause errors in the future, as Julia uses semantic versioning (semver) throughout the ecosystem to indicate API compatibility. The specifier ">=5.6" means that any version of at least "5.6.0" is acceptable, but future major releases "6.x.y", "7.x.y" will have API changes compared to version "5.6.x", potentially causing the code to stop working or to produce incorrect results.

If a dependent package is known to work for multiple major versions, these should be listed explicitly separated by commas.

Example: the Agents.jl dependency works for versions "^5.6" and "^6", so the compat section should read "5.6,6", indicating that the range of accepted versions is "[5.6.0, 7.0.0)".
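
As a sketch, the corresponding Project.toml entries could look like this (the julia bound follows the 1.10 recommendation above):

    [compat]
    Agents = "5.6, 6"
    julia = "1.10"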


Persefone.jl

Line 16

Explicitly listing which functions are imported with using is usually preferred, as it makes clearer which package each name comes from.
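
For example (the listed names are just an illustration):

    # instead of `using Agents`:
    using Agents: AgentBasedModel, allagents, add_agent!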

Lines 120-121

PrecompileTools.jl should be used so that calls to precompile only run during precompilation, and not every time the package is loaded via using Persefone.
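
A minimal sketch of the PrecompileTools.jl pattern (step! is a stand-in for a representative Persefone workload):

    using PrecompileTools

    step!(x) = x + 1   # stand-in for the real simulation entry points

    @setup_workload begin
        # this block runs only during precompilation,
        # not on every `using Persefone`
        @compile_workload begin
            # a small representative workload; the methods it touches
            # are compiled and cached together with the package
            x = 0
            for _ in 1:10
                x = step!(x)
            end
        end
    end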

analysis/makeplots.jl

It might make sense to decouple CairoMakie from the model code.

See https://git.idiv.de/persefone/persefone-model/-/issues/81 for details.

Line 15: function visualisemap

Type annotations on the date and landcover arguments would make it clearer which types can be passed in. Alternatively it could be stated in the docstring.

core/input.jl

Line 12

The code assumes / is the path separator. Does this work on Windows, where it is \? I am not sure whether there is some on-the-fly translation on Windows.

Better: joinpath(pkgdir(Persefone), "src", "parameters.toml")

Lines 16-17

I think this comment can be removed.

Lines 19-35: @param

Why does this have to be a macro? Why not:

param(model::AgentBasedModel, domain::AbstractString, par::AbstractString) = model.settings["$domain.$par"]
# or
param(model::AgentBasedModel, domainparam::AbstractString) = model.settings[domainparam]

Lines 118-149: @rand, @shuffle!, @chance

Why are these macros? Why not:

# extend the existing Base/Random functions rather than shadowing them
import Base: rand
import Random: shuffle!

rand(model::AgentBasedModel, args...) = rand(model.rng, args...)
shuffle!(model::AgentBasedModel, collection) = shuffle!(model.rng, collection)
chance(model::AgentBasedModel, odds) = rand(model.rng) < odds
# better nomenclature instead of 'odds'
chance(model::AgentBasedModel, prob) = rand(model.rng) < prob

Note on nomenclature: odds are defined as the ratio p / (1 - p), where p is a probability between 0 and 1. What is called odds here is actually a probability.

Line 180, in function parsecommandline

Perhaps the minimum Julia version for Persefone should be set to 1.10, and the comment can be removed. According to https://github.com/carlobaldassi/ArgParse.jl/issues/121 the fix is to upgrade Julia.

Line 183, in function parsecommandline