Contents

  1. Introduction
  2. Global Analysis
  3. Math Models
  4. Dct Example
  5. Software


Math Models for Protein / DNA Binding Systems

This document describes the use of NONLIN to analyze DNase I footprint data. To use NONLIN, one writes a Fortran module describing the relevant model, then compiles and links this module with those of NONLIN to create an executable program for processing a given data set. Compiling and linking were done with Lahey Fortran F77L, purchased from Lahey Computer Systems (Incline Village, Nevada, 800-548-4778), whose computation routines have been tested for use with NONLIN on a DOS system. (Identical results were obtained when the executable program was run on i386, i486 and Pentium CPUs.) To analyze DNase I footprint data, the Fortran module must describe the dependent variable, the observed protection against nicking by DNase I, as a function of the independent variable, the active concentration of protein ligand. To accommodate the different experiments included in a global analysis, the "model" contains different functions as needed. The following concepts explain how the module for analyzing DNase I footprint data works.

  1. Fractional protection is linearly related to fractional occupancy. This can be stated as P = (U - L)Y + L, where P is fractional protection, Y is fractional occupancy, and U and L are the upper and lower limits of the measure of protection (radioactive emissions from a gel or optical density of an autoradiogram, for example). Because it is always best to treat raw data rather than transformed data in a nonlinear regression analysis, the Fortran module must include these limits as parameters to be determined during the analysis.
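  2. Fractional occupancy can be modeled as a function of free ligand concentration and the Gibbs free energies of the various states in which a given site is occupied. Given the relationship between the Gibbs free energy of each state (dGs) and its fractional probability (fs), where [X] is the free ligand concentration, j is the number of ligands bound in state s, R is the gas constant and T is the absolute temperature:

    fs = exp(-dGs/RT) [X]^j / SUM(s,j) exp(-dGs/RT) [X]^j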

    one obtains the equation for fractional occupancy by adding together the fractional probability functions for all states in which a given site is occupied and dividing by the total sum of all fractional probabilities. Note that state 0, in which no ligands are bound, is taken as the reference state, whose dG value is arbitrarily set equal to zero. The dG values for the remaining states consist of the reference state value plus any other relevant free energies. For example, in a three-site system, state 4 has sites 1 and 2 occupied. Its free energy components are the reference state plus the energies of interaction between the protein and the two DNA sites (the intrinsic binding energies dG1 and dG2) and the energy of interaction between the two protein-DNA complexes (the cooperative energy dG12). When all three sites are filled, in state 7, the relevant free energy is dG7 = dG0 + dG1 + dG2 + dG3 + dG123. Table 1 illustrates the various states that exist in models for 1, 2, or 3 site systems.

    Replacing the dG's in the relevant fs terms of the fractional occupancy equations of Table 1 with the appropriate combination of dG1, dG2, dG3 and the cooperative dG's yields the desired model. Fractional protection at a given site is thus modeled by writing the fs terms of the appropriate fractional occupancy equation in terms of dG1, dG2, dG3, dG12, dG13, dG23, and dG123, and then using that equation to describe fractional protection as a function of the free energy parameters and the titration limits L and U. Creating the appropriate equation and applying it to the data describing the titration of a given site in a given experiment is one of the major tasks of the Fortran module.

  3. Finally, the concentration of free ligand may be only a fraction of the total protein concentration if, over the concentration range relevant to the binding study, the protein partitions between forms that have different binding capacities (monomers and dimers, for example), and/or if some of the protein was inactivated during purification. As a convenience for the user, the Fortran module provides a routine for converting input protein concentration (as total [monomer]) to its equilibrium values for monomer and dimer species, provided the equilibrium constant is known. If this value is not known, the user may input protein concentrations in absolute amounts, in which case the analysis assumes that only a single species of protein exists.
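
The first two concepts can be sketched in a few lines of Python. This is not the NONLIN Fortran module, only an illustration under stated assumptions: each state is given the usual Boltzmann statistical weight exp(-dG/RT) multiplied by the free ligand concentration raised to the number of ligands bound; states are numbered by ligand count, which reproduces state 4 (sites 1 and 2) and state 7 (all three sites) as described above; energies are in kcal/mol at an assumed temperature of 293.15 K. All function names and numerical values are hypothetical.

```python
import math
from itertools import combinations

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K); the unit choice is an assumption


def enumerate_states(n_sites):
    """List occupancy states as tuples of occupied site numbers.

    Ordered by number of bound ligands, so for three sites state 0 is (),
    state 4 is (1, 2) and state 7 is (1, 2, 3).
    """
    sites = range(1, n_sites + 1)
    return [combo for k in range(n_sites + 1) for combo in combinations(sites, k)]


def state_free_energy(state, intrinsic, coop):
    """dG of a state relative to the empty reference state (dG0 = 0).

    Sums the intrinsic energies of the occupied sites and adds the
    cooperative term registered for that exact combination of sites, if any.
    """
    return sum(intrinsic[site] for site in state) + coop.get(state, 0.0)


def fractional_occupancy(site, free_ligand, intrinsic, coop, temp_k=293.15):
    """Y for one site: weights of states with `site` filled over all weights."""
    rt = R_KCAL * temp_k
    weights = {}
    for state in enumerate_states(len(intrinsic)):
        dg = state_free_energy(state, intrinsic, coop)
        weights[state] = math.exp(-dg / rt) * free_ligand ** len(state)
    occupied = sum(w for state, w in weights.items() if site in state)
    return occupied / sum(weights.values())


def fractional_protection(occupancy, lower, upper):
    """P = (U - L)Y + L, with L and U the titration limits."""
    return (upper - lower) * occupancy + lower


# Hypothetical two-site example (illustrative numbers, kcal/mol and molar).
intrinsic = {1: -10.0, 2: -9.0}   # dG1, dG2
coop = {(1, 2): -2.0}             # dG12
y1 = fractional_occupancy(1, 1e-8, intrinsic, coop)
p1 = fractional_protection(y1, lower=0.05, upper=0.95)
```

In a regression setting, the entries of `intrinsic` and `coop` together with `lower` and `upper` would be the fitted parameters, and `fractional_protection` would be evaluated at each free ligand concentration in the titration.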

Summary

The Fortran module must therefore be able to recognize different DNA template structures; derive the relevant equations relating binding energies, fractional occupancy and titration limits to fractional protection at each site in each experiment; and, if needed, calculate free ligand concentration given a dissociation constant and total (functionally competent) protein concentration.
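
The last of these tasks, converting total protein to equilibrium species concentrations, can be illustrated for a simple monomer-dimer equilibrium. The sketch below is not the routine shipped with the Fortran module. It assumes the equilibrium 2 M <-> D with a dissociation constant defined as Kd = [M]^2 / [D], and total protein expressed in monomer units, so that total = [M] + 2[D]. The function name and arguments are hypothetical.

```python
import math


def monomer_dimer(total_monomer, kd_dimer):
    """Split total protein (in monomer units) into free monomer and dimer.

    Assumes 2 M <-> D with Kd = [M]^2 / [D] and the mass balance
    total = [M] + 2[D]; both arguments must use the same molar units.
    """
    if total_monomer < 0 or kd_dimer <= 0:
        raise ValueError("total_monomer must be >= 0 and kd_dimer > 0")
    # Positive root of 2[M]^2/Kd + [M] - total = 0, written in a form that
    # avoids subtractive cancellation when total is much smaller than Kd.
    root = math.sqrt(1.0 + 8.0 * total_monomer / kd_dimer)
    monomer = 2.0 * total_monomer / (1.0 + root)
    dimer = monomer ** 2 / kd_dimer
    return monomer, dimer


# Illustrative call: 3 concentration units of total monomer, Kd of 1 unit.
monomer, dimer = monomer_dimer(3.0, 1.0)
```

Which species is then passed on as the free ligand (monomer or dimer) depends on the binding model being fitted.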
