Skip to content

.best_fit_curve()

Jacob Morris edited this page Jun 8, 2019 · 5 revisions

.best_fit_curve()

Parameters can be logged with all of the timing methods on Timer. Those arguments, along with the measured time provide independent variable(s) and a dependent that can be used to determine a best-fit-curve describing the relationship between the parameters and execution time. There are currently four best-fit methods:

  • Exponential - of the form a + b*ex, there can only be one independent variable
  • Linear - of the form b + x_0x0 + ... + x_nxn, there can be any number of independent variables as multi-variable linear regression is performed
  • Logarithmic - of the form a + b*log(x), there can only be one independent variable
  • Polynomial - of the form ax2 + bx + c, there can only be one independent variable

Very Important: The number, type, and name of any positional or keyword arguments must be the same for every run in a split when determining the best-fit-curve.

Important: There can only be more than one logged parameter for the Linear curve type and all parameter values must be integers. To accomplish this, irrelevant positional and keyword parameters can be excluded with exclude. Just pass a set of the index position or keyword argument name. Argument transformer function(s) can be passed in one of two ways:

  • transformers=<callable>, where that function will be used to transform all non-excluded argument values
  • transformers={index/name -> callable}, where any positional argument whose index is in the map or keyword argument whose name is in the map will be transformed by the corresponding function.

Example, if a run is logged with binary_search([1, 2, 3, 4], element=2, fallback=None), then the call to determine the curve should be .best_fit_curve(exclude={"element", "fallback"}, transformers={0: len}). This will determine the best-fit-curve of runtime against the length of the list being searched.

Note: Runs in the most recent split are used for determining the best-fit-curve by default, but a split can be specified by passing its index or label as split_index=.

@timer.decorate(runs=100, iterations_per_run=5, log_arguments=True, call_callable_args=True)
def binary_search(sorted_array, element, fallback=-1):
    lower, upper = 0, len(sorted_array)
    middle = upper // 2

    while middle >= lower and middle != upper:
        if element == sorted_array[middle]:
            return middle
        elif element > sorted_array[middle]:
            lower = middle + 1  # lower must be beyond middle because the middle wasn't right
        else:
            upper = middle - 1  # upper must be lower than the middle because the middle wasn't right

        middle = (upper + lower) // 2

    return fallback  # couldn't find it

binary_search(lambda: [i for i in range(randint(0, 10000))], element=lambda: randint(0, 10000), fallback=None)
curve = timer.best_fit_curve(exclude={"element", "fallback"}, transformers=len)
print(curve)
# ('Logarithmic', {'a': 5.0942363348944415e-06, 'b': 1.6375445141153575e-06})

The result of .best_fit_curve() will be a tuple of the name of the best-fit-curve, from the list above, and a map of the parameters determined for the best-fit-curve. The time unit of any returned values is seconds.

@timer.decorate(runs=100, iterations_per_run=10, call_callable_args=True, log_arguments=True)
def factorial(n):
    if n == 1:
        return 1
    else:
        return n * factorial.__wrapped__(n-1)

factorial(lambda: randint(1, 100))
timer.plot(plot_curve=True, time_unit=timer.US, equation_rounding=5)

Imgur image

As you can see, factorial is almost exactly linear, but a polynomial curve apparently fit slightly better. However, a specific curve type can be forced by setting curve_type in the call to .best_fit_curve().

timer.plot(plot_curve=True, curve=timer.best_fit_curve(curve_type="Linear"), time_unit=timer.US, equation_rounding=5)

Imgur image

Clone this wiki locally