Dose response curve#

Introduction#


A dose-response curve shows the relationship between the administered dose of a drug (\([Drug]\), in ng ml\(^{-1}\) or \(\mu\)M) and its pharmacological effect (Response).

When drug response (Y-axis) is plotted against drug dose on a base 10 logarithmic scale (X-axis), the result is typically a sigmoidal curve. This representation is more useful than a linear plot because it expands the dose scale in the region where drug response is changing rapidly and compresses the scale at higher doses where large changes have little effect on response.

Nonlinear regression can determine a drug’s potency (i.e. the concentration that gives half-maximal response, \(EC_{50}\)). The Hill Equation or 4 parameter logistic model is a standard model often used in dose-response curve analysis:

\[ Response = R_{min} + \frac{R_{max} - R_{min}}{1 + \left(\frac{10^{\log_{10}(EC_{50})}}{10^{x}}\right)^{n_{Hill}}} \]

Where \(Response\) is the measured signal, \(x\) is the log of drug dose or concentration, \(EC_{50}\) is the relative 50% effective dose or concentration, \(n_{Hill}\) is the Hill exponent and describes the steepness of the curve, \(R_{max}\) is the maximum effect and \(R_{min}\) is the effect in the absence of drug.
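Because \(10^{\log_{10}(EC_{50})} = EC_{50}\) and \(10^{x} = [Drug]\), the same model can be written in terms of the untransformed concentration. Setting \([Drug] = EC_{50}\) then shows why \(EC_{50}\) is the half-maximal concentration: the response lies exactly midway between \(R_{min}\) and \(R_{max}\).

\[ Response = R_{min} + \frac{R_{max} - R_{min}}{1 + \left(\frac{EC_{50}}{[Drug]}\right)^{n_{Hill}}}, \qquad Response\big|_{[Drug] = EC_{50}} = R_{min} + \frac{R_{max} - R_{min}}{2} = \frac{R_{min} + R_{max}}{2} \]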

Dose response curve model

See here for more information.

Data#


Get the data needed for this exercise here.

The spreadsheet “DoseResponseCurveAssay.xlsx” contains two replicates. These are two independent experiments and, as we can see, the administered drug doses differ between them. As a result, we cannot calculate average response values at each dose.

Dose response curve assay data

Data analysis#


Exercise 38

Import the libraries needed. Use convenient naming.
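One possible set of imports is sketched below. The aliases `np`, `pd` and `plt` are common community conventions rather than requirements, and the choice of matplotlib for plotting is an assumption; any plotting library would do.

```python
import numpy as np                      # numerical arrays and log10
import pandas as pd                     # reading the Excel file into DataFrames
import matplotlib.pyplot as plt         # plotting data, fit and residuals
from scipy.optimize import curve_fit    # non-linear least-squares fitting
```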

Exercise 39

Read in the data containing \([Drug]\) (ng ml\(^{-1}\)) and response from the Excel file into a Python pandas DataFrame.
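A minimal sketch of the import step, assuming the file sits in the current working directory and the data are on the first worksheet. The helper name `read_assay` is hypothetical, and the layout of the sheet (header rows, column names) is not known here, so `pandas.read_excel` may need extra arguments such as `header` or `usecols`.

```python
import pandas as pd

def read_assay(path, sheet_name=0):
    """Read one worksheet of the assay spreadsheet into a DataFrame.

    Hypothetical helper: sheet_name=0 (the first sheet) is an assumption.
    """
    # pandas needs an Excel engine (e.g. openpyxl) installed to read .xlsx files
    return pd.read_excel(path, sheet_name=sheet_name)

# Usage, assuming the file is in the current working directory:
# df = read_assay("DoseResponseCurveAssay.xlsx")
# print(df.head())   # inspect the first rows and the column names
```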

Exercise 40

Calculate \(log_{10}\)\([Drug]\).

Tip: use the numpy.log10 command to calculate \(log_{10}\).
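As a sketch of this step, `numpy.log10` can be applied directly to a DataFrame column. The column names `dose` and `log_dose` and the dose values below are illustrative assumptions, not the contents of the spreadsheet.

```python
import numpy as np
import pandas as pd

# Illustrative doses only (ng/ml); these are NOT the values from the spreadsheet.
df = pd.DataFrame({"dose": [1.0, 10.0, 100.0, 1000.0]})

# np.log10 works element-wise on a pandas Series and returns a Series
df["log_dose"] = np.log10(df["dose"])
print(df)
```

Note that `np.log10` returns `-inf` for a dose of 0, so a zero-dose control cannot be placed on the logarithmic axis without special handling.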

Exercise 41

Plot the data: log\([Drug]\) versus response.
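A minimal plotting sketch, assuming matplotlib. The function name `plot_dose_response` and the demonstration values are placeholders; with the real data, each replicate could be drawn on the same axes with its own label.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_dose_response(log_dose, response, label="data"):
    """Scatter plot of response against log10 dose (hypothetical helper)."""
    fig, ax = plt.subplots()
    ax.plot(log_dose, response, "o", label=label)  # markers only, no connecting line
    ax.set_xlabel(r"$\log_{10}$[Drug]")
    ax.set_ylabel("Response")
    ax.legend()
    return fig, ax

# Placeholder values for illustration; replace with the columns of the DataFrame.
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = np.array([2.0, 20.0, 80.0, 98.0])
fig, ax = plot_dose_response(x_demo, y_demo)
# plt.show()  # uncomment when running interactively
```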

Inspect the data!

  • Do we discern a clear trend in our data?

    • Do the data show a positive (sloping upward), negative (sloping downward), or no (spread out) correlation?

    • Do we notice a linear or a non-linear relationship between x- and y-values?

  • Do we have outliers?

    • Were the values entered correctly?

    • Were there any experimental errors? E.g. a calculation error that we picked up afterwards when looking at our lab notebook?

    • Are the data points a mistake? E.g. a pipetting error?

In this example, we do not have many replicate data points at each value of x, so we cannot use statistical tests, e.g. the Grubbs test (see here for more information and an online Grubbs test calculator), on each set of replicates to determine whether a data point is a significant outlier from the rest.

We often only measure a data point once or twice, like in this example. Useful tools in this case are:

  • studentized residuals, which look at the residuals calculated from the experimental data and model data. See here for more information.

  • robust techniques which use median, rather than mean, values. See here for more information.

  • Cook’s distance, which measures the influence of each data point (of note, an influential point is not always an outlier!), thereby determining how much predicted values would change if that point were deleted. See here for more information.

  • ROUT (Robust regression and OUTlier removal) for non-linear regression. See the article by Motulsky HJ and Brown RE (2006), available here, for more information.

Exercise 42

Define the Hill function to fit the data.
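One way to write the model as a Python function is sketched below. It follows the equation from the introduction, with \(x\) as the first argument because `curve_fit` expects the independent variable first. Fitting \(\log_{10}(EC_{50})\) rather than \(EC_{50}\) itself is a choice made here to match that equation; the argument names are assumptions.

```python
import numpy as np

def hill(x, r_min, r_max, log_ec50, n_hill):
    """Four-parameter Hill model; x and log_ec50 are log10 concentrations."""
    x = np.asarray(x, dtype=float)
    return r_min + (r_max - r_min) / (1.0 + (10.0**log_ec50 / 10.0**x) ** n_hill)

# Sanity check with arbitrary parameters: at x = log_ec50 the response is the
# midpoint between r_min and r_max.
print(hill(1.0, 0.0, 100.0, 1.0, 2.0))  # 50.0
```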

Exercise 43

Fit all data using the Hill function.

Tips:

  • Use the pandas.concat function to combine columns.

  • Look at the graph to find good guesses for the model parameters (see figure above).
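The fitting step could look roughly like the sketch below. Since the spreadsheet is not available here, the two replicates are replaced by synthetic data generated from known parameters plus noise. The column names, the true parameters and the rule for the initial guesses `p0` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

def hill(x, r_min, r_max, log_ec50, n_hill):
    return r_min + (r_max - r_min) / (1.0 + (10.0**log_ec50 / 10.0**x) ** n_hill)

# Synthetic stand-in for the two replicates (NOT the assay data).
rng = np.random.default_rng(0)
true_params = (0.0, 100.0, 1.2, 1.0)  # r_min, r_max, log10(EC50), n_hill
rep1 = pd.DataFrame({"log_dose": np.linspace(-1.0, 3.0, 9)})
rep2 = pd.DataFrame({"log_dose": np.linspace(-0.5, 3.5, 9)})
for rep in (rep1, rep2):
    noise = rng.normal(0.0, 2.0, len(rep))
    rep["response"] = hill(rep["log_dose"], *true_params) + noise

# Combine both replicates into single x and y columns
x_all = pd.concat([rep1["log_dose"], rep2["log_dose"]], ignore_index=True)
y_all = pd.concat([rep1["response"], rep2["response"]], ignore_index=True)

# Initial guesses read off the data: plateaus, a mid-range log dose, slope of 1
p0 = [y_all.min(), y_all.max(), x_all.median(), 1.0]
popt, pcov = curve_fit(hill, x_all, y_all, p0=p0)
print(popt)  # fitted r_min, r_max, log10(EC50), n_hill
```

Pooling the replicates before fitting sidesteps the problem that the doses differ between experiments: every measured point contributes to one fit, and no averaging is needed.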

Exercise 44

Report the fit parameters and standard errors on the fit parameters: \(R_{min}\), \(R_{max}\), \(EC_{50}\), and \(n_{Hill}\).
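A sketch of the reporting step, assuming `popt` and `pcov` come from `curve_fit` and that the model was fitted in terms of \(\log_{10}(EC_{50})\). The standard errors are the square roots of the diagonal of the covariance matrix. The numbers below are placeholders, not results, and the conversion of the error to the linear \(EC_{50}\) scale uses first-order error propagation, which is only an approximation.

```python
import numpy as np

def summarize_fit(popt, pcov, names=("R_min", "R_max", "log10(EC50)", "n_Hill")):
    """Pair each fitted parameter with its standard error (hypothetical helper)."""
    perr = np.sqrt(np.diag(pcov))  # one standard deviation per parameter
    return {name: (float(v), float(e)) for name, v, e in zip(names, popt, perr)}

# Placeholder values for illustration only; use the output of curve_fit instead.
popt_demo = np.array([1.0, 99.0, 1.2, 1.1])
pcov_demo = np.diag([0.25, 0.36, 0.0004, 0.0025])

summary = summarize_fit(popt_demo, pcov_demo)
for name, (value, err) in summary.items():
    print(f"{name} = {value:.3g} +/- {err:.2g}")

# Convert log10(EC50) back to a concentration; d(10**u)/du = ln(10) * 10**u
log_ec50, log_ec50_err = summary["log10(EC50)"]
ec50 = 10.0**log_ec50
ec50_err = np.log(10.0) * ec50 * log_ec50_err
print(f"EC50 = {ec50:.3g} +/- {ec50_err:.2g}")
```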

Exercise 45

Calculate the residuals.
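The residuals are the differences between the measured responses and the model evaluated at the same x-values. A minimal sketch follows; the helper name `residuals` and the demonstration values are placeholders.

```python
import numpy as np

def hill(x, r_min, r_max, log_ec50, n_hill):
    return r_min + (r_max - r_min) / (1.0 + (10.0**log_ec50 / 10.0**x) ** n_hill)

def residuals(x, y, params):
    """Observed minus predicted response at each measured x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return y - hill(x, *params)

# Placeholder data and parameters for illustration only.
x_demo = np.array([0.0, 1.0, 2.0])
y_demo = np.array([10.0, 52.0, 90.0])
params_demo = (0.0, 100.0, 1.0, 1.0)  # r_min, r_max, log10(EC50), n_hill
res_demo = residuals(x_demo, y_demo, params_demo)
print(res_demo)
```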

Exercise 46

Produce a combined figure showing the residuals plot underneath the main plot with data and fitted curve. Make sure they are aligned and have the same X-axis so we can see which residual corresponds to which data point.
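One way to build such a figure is with two stacked matplotlib axes that share their X-axis, as sketched below. The function name, panel height ratio and demonstration values are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def hill(x, r_min, r_max, log_ec50, n_hill):
    return r_min + (r_max - r_min) / (1.0 + (10.0**log_ec50 / 10.0**x) ** n_hill)

def plot_fit_with_residuals(x, y, params):
    """Data and fitted curve on top, residuals below, sharing the X-axis."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_fine = np.linspace(x.min(), x.max(), 200)  # dense grid for a smooth curve
    fig, (ax_fit, ax_res) = plt.subplots(
        2, 1, sharex=True, figsize=(6, 6), gridspec_kw={"height_ratios": [3, 1]}
    )
    ax_fit.plot(x, y, "o", label="data")
    ax_fit.plot(x_fine, hill(x_fine, *params), "-", label="Hill fit")
    ax_fit.set_ylabel("Response")
    ax_fit.legend()
    ax_res.plot(x, y - hill(x, *params), "o")
    ax_res.axhline(0.0, color="gray", linestyle="--")  # reference line at zero
    ax_res.set_xlabel(r"$\log_{10}$[Drug]")
    ax_res.set_ylabel("Residual")
    fig.tight_layout()
    return fig, ax_fit, ax_res

# Placeholder data and parameters for illustration only.
x_demo = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y_demo = np.array([10.0, 25.0, 52.0, 74.0, 90.0])
fig, ax_fit, ax_res = plot_fit_with_residuals(x_demo, y_demo, (0.0, 100.0, 1.0, 1.0))
# plt.show()  # uncomment when running interactively
```

Because of `sharex=True`, zooming or changing the limits of one panel updates the other, so each residual stays vertically aligned with its data point.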

Inspect the quality of the fit!

  • Look at the graph of the experimental data and the fitted curve. Do the experimental data and model match?

  • Look at the graph of the residuals. Are they around 0? Are they random or is there a trend? If the residuals display a systematic pattern, the model fits the data poorly.

  • Look at the fit parameters and the standard errors on the fit parameters. Are the fit parameters within (biological) reason? Are the standard errors on the fit parameters small? If a standard error on a fit parameter is bigger than the fit parameter, it is possible that there are not enough data points or that the model fits the data poorly.

  • Look at the goodness of fit statistics. But be careful! For example, R-squared, ranging from 0 (worst possible fit) to 1 (best possible fit), compares the fit of your model to the fit of a horizontal line through the mean of all Y values. This comparison is valid for linear regression, but not for non-linear regression. For this reason, such fit statistics are not readily available as output of the SciPy curve_fit() function…