Scarabaeus Coding Conventions#

Last revised by Z. Ellis on 2025 MAR 5

Introduction#

This document describes conventions for developing in the Scarabaeus library. Much of these conventions are derived from Python’s PEP 8 style guide; wherever some convention is not explicitly stated in the following guide, refer to it for guidance.

This document will not cover writing docstrings and instead will only note where they should be included. Be sure to follow the format described in the docstring formatting section of the External Resource Maintenance Guide. It’s recommended that you first start with this guide and focus on learning how the code itself should be structured. Once you feel comfortable with that, then move on to the docstring formatting guide as it contains a different set of information that isn’t important for writing the code itself.

The coding guidelines layed out below exist to serve two purposes:

  • to ensure a coherent, readable protocol while developing for Scarabaeus

  • to make maintaining Scarabaeus’s documentation as easy as possible

Each section of this document explains a section of a given template script. Following along in order should explain the purpose and formatting of each section, allowing us to build the template over the course of this reading.

One final note to remember from the PEP 8 style guide:

“… code is read much more often than it is written… know when to be inconsistent - sometimes style guide recommendations just aren’t applicable. When in doubt, use your best judgment. Look at other examples and decide what looks best.”


General Rules#

Before we go line by line through the example script, let’s quickly cover a few general rules that you should follow when writing code in Scarabaeus. You’ll notice throughout the example script that these rules are followed.

Commenting#

Scarabaeus is a large project. It contains many scripts written by many different people with many different educational backgrounds, native languages, and coding styles. This is great for collaboration, but it can also mean that, even without the complexity of the project, understanding a piece of code can sometimes be difficult.

In order to reduce these kinds of issues, it’s vital that you comment your code as thouroughly as possible. Some people like to block their code out with comments, others prefer to write all the code and come back through at the end to comment. Some switch between the two styles, or mix them, or have their own completely different style. The way in which you comment does not matter, so long as you’re commenting!

From the PEP 8:

“Ensure that your comments are clear and easily understandable to other speakers of the language you are writing in.”

If you’re worried that you’re over-commenting — you’re not. Your fellow developers, and maybe even future you, will be thankful that you took the extra 10 seconds to write down what a line of code does.

Type-Hinting#

Type-hinting is an important extension of commenting. It tells both users and developers what should be going where. It appears in the online documentation (where you are right now), as well as in the code through autocompletion tools like Intellisense.

You should type-hint both arguments and returns. You can see how this is done throughout the Code Layout section.

Naming Conventions#

Both method and variable names should be written in snake_case. When assigning names, try for succintness, but don’t overdo it. We all want our variables and methods to have nice compact names, but it’s important that they’re still legible. Say we want to abbreviate the name covariance_matrix. It’s long, and we could cut a few letters to make it briefer. Something like cv_mt is nice and short, but it isn’t immediately obvious what we’re talking about. Instead, a middle ground like covar_mat would be a better choice. We’ve still shortened our name, but we haven’t lost what it’s supposed to represent.


Code Layout#

In the following sections of the Code Layout, we will walk through each component of a template script TemplateClass.py and explain the reasoning behind the decisions that have been made. Note that the entire file is available to view here, as well as in Scarabaeus itself under docs/TemplateClass.py and contains more comments to supplement it. Many of these comments have been removed in the following code snippets for brevity.

File Header#

The first component of a Scarabaeus script is the File Header.

Many file names abbreviate their respective class. The header defines the full name of the class to be described by the contents of the script. For example, the ArrayWUnits file name, as well as class name, are an abbreviation of its full name “Array With Units”.

Our template script, with file name TemplateClass, will thus have a header that looks like:

#===========================#
#   CLASS: TEMPLATE CLASS   #
#===========================#

Note that the header text is set between two bars of equals signs. The text itself is centered in these bars with three spaces on either side. All three lines are book-ended with another hash symbol for consistency.

Notes and To-Do’s#

Below the file header, we should place any Notes or To-do’s that might be relevant to the script.

  • # NOTE’s are used to make anyone looking at your script aware of some information. For example, we could tell someone where to find this guide for more information on our template script.

  • # TODO’s are used to track any changes or updates that might need to be made in the code.

Sometimes it makes more sense to leave a Note or Todo directly in the code. If this occurs, make sure to also mention it along with its location at the top of the script with all of the other Note’s and Todo’s. This ensures that they can be seen at a quick glance and won’t be lost deep down in the code.

With these guidelines set, the next piece of our template script becomes:

#===========================#
#   CLASS: TEMPLATE CLASS   #
#===========================#
# NOTE: file name should reflect title above with CamelCase

# TODO: mark any planned functionality, methods, or changes/updates that need to be made for the class
#       here using the TODO identifier. Also note any other TODO's present in methods below

Imports#

Our first section of the script contains our imported libraries. Each section title is placed between two bars of dashes with the text centered in the middle with two spaces. Like the file header, all three lines are book-ended by hash marks.

If you’re importing any other classes from within Scarabaeus, you should do so here to increase performance like so:

#-----------#
#  Include  #
#-----------#
import scarabaeus as scb                    # standardized convention for shortening Scarabaeus
from scarabaeus import(Units, ArrayWUnits, # directly import Scarabaeus modules
                    Body)

# some commonly used (but not required) imports
from scarabaeus import Constants            # contains all of the values that should be kept constant across Scarabaeus

import csv                                  # used to read csv files, a common file type used within Scarabaeus
from typing import Literal                  # useful for method inputs, see selector_method_template() for more
from typing import Self                     # used to type-hint a return of the class itself

import numpy as np                          # offers many standard physical constants and math operations
import numpy.typing as npt                  # allows for type-hinting numpy classes like matrix, array-like, etc

Many commonly used imports have been included here in this template, but note that nothing is required unless you need to use it. The third import, Constants, contains values for many physical constants, as well as parameters for planets and other celestial objects. For more information, see the User Guide section on Constants.

Generating Units#

The next section, Generating Units, is commonly used, but not always required. Many times, you may need to use units in your script. If you do, include this section. Otherwise, skip it entirely.

Note

Remember to follow the section titling conventions outlined in the Imports section.

We will be using units in our example script, and so the following will go over the different ways you might consider implementing them:

#------------------#
#  Generate Units  #
#------------------#
# NOTE: don't include this section if no units are required

# if you know you'll be working with specific units, you can generate them here before the class
km, N = Units.get_units(['km', 'N'])

# you can also generate all of the base units in one line
g, m, sec, rad = Units.get_units('base')

# or create more specific units that you might need
grav_param_unit = km**3*sec**-2
  • The first comment shows how units should be defined in your script using UnitsArray. This is the recommended way to define units in Scarabaeus.

  • The second comment shows how to generate all base units. This is useful while developing in the scenario where you are unsure of what units you will be using.

Note

Remember to clean these up once you’ve finalized what units you’re using in your script, as defined by the first comment.

  • Finally, the last commment provides an example for how to create your own compound units. Many times, you might need to use some units that aren’t in Scarabaeus’ unit library by default. Instead of continuosly redefining these units, define them once during setup and use them throughout the script. Also note that this unit already exists in Units as 'mu', but we chose to build it ourselves as an example.

Class Definition and Class Constants#

Now that we’ve prepared our script, it’s time to start writing code for the new object itself. First we need to define the class and and any class-level constants it might contain.

Note that the Class Docstring should be placed here as well. Refer to the docstring formatting guide mentioned at the top of this article for guidance on writing this important component of the class.

#--------------------#
#  Class Definition  #
#--------------------#
class TemplateClass(Body):
    """
        CLASS DOCSTRING HERE
    """
    #---------------------------#
    # region    Class Constants #
    #---------------------------#
    # NOTE: place class constants before anything else. For more examples,
    #       see Dimensions.py or Units.py
    cls_const_1 = 1

    cls_const_2 = np.array([[1, 2, 3],
                            [4, 5, 6],
                            [7, 8, 9]])

    # endregion Class Constants #
    #---------------------------#

Class constants don’t crop up very often, so most of the time you won’t need to include this section. If, like Units.py, you do need to include class constants, make sure you denote them using a collapsible section header. There are a few of these “region”/”endregion” headers throughout the script. They serve two purposes:

  1. They allow developers to collapse large sections easily by clicking the down arrow beside the line number

  2. They help organize and provide visible landmarks in the side view in IDE’s like VSCode

As we continue writing our script, you’ll notice that we use them to break it up in a similar way to how this document is organized. This is an important part of writing clean, readable code.

Initialization#

Now it’s time to write our __init__ method. We’ll place another header above it, but here we don’t include an “endregion” delimiter because we don’t need it to be collapsible. However, we retain the “region” delimiter so that “Constructor” becomes a side view landmark in our IDE.

#--------------------#
# region Constructor #
#--------------------#
def __init__(self, arg_one : int | str, arg_two : ArrayWUnits = None, arg_three : str = None):
    # if class is a child of another, perform parent class initializations with super() first
    super().__init__('DEFAULT INPUT', arg_one, arg_two)

    # define the rest of the class properties here
    self._prop_one    = arg_one                             # no extra input validation
    self.prop_two     = arg_two                             # calling this property's setter, no "_"
    self._prop_three  = self.__utility_template(arg_three)  # calling a private utility class, see __utility_template() for more

Since TemplateClass is a child of Body , we’ll need to make sure that we pass any necessary parent arguments using super() first. Once that’s taken care of, we can start assigning values to our properties. This can be done in one of two ways:

  1. Direct assignment via self._ - if we don’t need to validate our input at all, we can set it directly here

  2. Input validation and/or conditioning via self. - if we need to perform some extra logic before we store a value, we’ll first pass it to its property’s setter

All properties of a class should be assigned to their private version (prefixed by “_”), but depending on the situation we might need to perform some logic before doing so.

In this example, we don’t want to do anything with arg_one before we assign it to a property, so we can save it directly with self._prop_one = ``. For ``prop_three, we pass arg_three to a utility function (more on those later) first, so we can do any conditioning or input validation there. Since we know that the value we’re setting it to will take the form we expect, we can directly assign out third property with ``self._prop_three = `` .

But, let’s say that we do need to do a little more work with arg_two before saving it. Here, we can pass it through prop_two ‘s setter before we actually assign it to _prop_two . We invoke the property setter by omitting the “_” with ``self.prop_two = `` . In the next section, we’ll talk about how to properly define property setters, as well as getters.

Properties#

As we saw in the section above, most of the initial data processing for a class happens in the property setters rather than the __init__ method itself. This lets us de-clutter and compartmentalize our code. Properties are important for two reasons:

  1. They allow us to create an effective data structure. ArrayWUnits.units.dimensions is intuitive and easily accessible.

  2. They make documentation easier. Discretizing the components of a class in this way gives a clear picture of what is and is not available from a given object.

To support this, it’s important that we’re diligent with our Properties Section. Each property consists of a few components:

  1. The @property decorator - this tells Python (and Sphinx) that we’re creating a property getter.

  2. The property definition - this is placed below the decorator and is given the same name as the property we’re creating it for. Remember to pass self.

  3. The propery docstring - just like we did in the section above, we’ll only note that this docstring belongs here. See the docstring formatting guide for populating this.

  4. The return statement - a simple return self._property. This ensures that whenever the property is requested, its private version is returned.

For every property that we define in our __init__ , we must also ensure that we create a getter with this format in the Properties Section. Order them in the same way that they’re ordered in the __init__ . Since we defined them one, two, three above, we’ll define them one, two, three here.

#----------------------#
# region    Properties #
#----------------------#
@property
def prop_one(self):
    """
        PROPERTY DOCSTRING HERE
    """
    return self._prop_one

# NOTE: create a property setter when input validationis required. This reduces clutter in the __init__file.
#       Example below: we need to check the units of an ArrayWUnits. First, define the property itself:
@property
def prop_two(self):
    """
        PROPERTY DOCSTRING HERE
    """
    return self._prop_two

# NOTE: now, implement logic (in this case input validation) in the property's setter method
@prop_two.setter
def prop_two(self, input_val : ArrayWUnits):
    # let's assume that we expect arg_two to be given in grav_param_units (kg^3/s^2) defined in the # Generate Units # section:
    if isinstance(input_val, ArrayWUnits):
        if input_val.units != grav_param_unit:
            # the given units are incorrect, raise an error
            bad_unit_err = ('Argument [arg_two] should be an ArrayWUnits object with Units of km^3/s^2. '
                            f'Received: {input_val.units}')
            raise ValueError(bad_unit_err)

        else:
            # given units are correct, allow property definition
            self._prop_two = input_val

    elif isinstance(input_val, type(None)):
        # in this case, since the default value for arg_two is 'None', an input value of 'None' must also be valid
        self._prop_two = input_val

@property
def prop_three(self):
    """
        PROPERTY DOCSTRING HERE
    """
    return self._prop_three

# endregion Properties #
#----------------------#

Note

Don’t forget the “region”/”endregion” headers encasing your Properties Section.

Notice that after we defined the getter for prop_two , we added another function below it. This is a property setter, and it’s extremely important for input validation and conditioning. Any time that you want to make sure an input is of the correct type, you want to store/ingest it in some specific way (conditioning), or you need to perform some other kind of logic, place it in that property’s setter. A setter consists of:

  1. The @prop_name.setter decorator - this tells Python that whenever prop_name is assigned a value it needs to run through this code first.

  2. The setter definition - this is identical to the property’s getter definition, except we also need to take in the value we’re checking itself, called input_val .

  3. The setter logic - whatever code we want to run input_val through. It needs be structured in a way that the private property is assigned.

In the example above, we’re making sure that ``prop_two `` has been given in the correct units. Look through the code and notice that, if given the correct units, we assign the private property with ``self._prop_two = `` .

Operator Overloading#

There are many reasons why you might want to overload the default operators of your class, the most common being the __repr__ method. Here we’ll quickly go over how this should be done in Scarabaeus.

Just like the Properties Section, we’ll mark our Operators Section with “region”/”endregion” headers. Because operators are not documented, you don’t need anything else in their docstring besides a note that it’s being overloaded, like we wrote below.

Note

Overloaded operators are not documented, so it’s important if you change any logic for operators like __add__, __mul__, etc that you note these modified behaviors in the Notes Section of your Class Docstring.

#---------------------#
# region    Operators #
#---------------------#
def __repr__(self):
    """
        Overloading the printable representation operator for a TemplateClass object.
    """
    return 'TemplateClass object'

# endregion Operators #
#---------------------#

In this scenario, we just make __repr__ return a static “TemplateClass object” string. For examples of a more complex Operators Section, look at ArrayWUnits.py or EpochArray.py.

Methods#

The final, and most likely largest, component of your script is its Methods Section. Because methods are so bespoke, there isn’t much in terms of coding convetions to go over. Remember to include the “Methods” header, type-hint your inputs and returns, follow the docstring formatting guide, and comment your code.

The only extra piece of information you’ll need for methods is to remember that private helper methods should be prefixed with a two underscores. Look at the two methods below:

#----------------#
# region Methods #
#----------------#
def __utility_template(self, file_name : str) -> np.array:
    """
        METHOD DOCSTRING HERE
    """
    # read in data from the input file
    file = open(file_name, "r")
    data = list(csv.reader(file, delimiter = ","))
    file.close()

    # return as a numpy array
    return np.array(data)

def general_template_method(self, method_arg : int) -> int:
    """
        METHOD DOSCTRING HERE
    """
    return method_arg * 1

The first method, __utility_template(), is the function we called all the way back at the start of this guide in our __init__ when we were assigning our properties. Since it’s an internal utility method, we add the two underscores to note that it should only ever be called within our class.

The second method, general_template_method(), is a basic method that would be used outside of our class.


Further Reading#

Congratulations, you are now ready to start writing classes for Scarabaeus! Before you complete your code, make sure to read through the Scarabaeus External Resource Guide. Welcome to the project, we’re excited to see what you do!