Package pdb

import "github.com/TuftsBCB/io/pdb"

Overview

Constants

const (
    SeqProtein = iota
    SeqDeoxy
    SeqRibo
)

func RMSD

func RMSD(entry1 *Entry, chainId1 byte, start1, end1 int,
    entry2 *Entry, chainId2 byte, start2, end2 int) (float64, error)

RMSD is a convenience function for computing the RMSD between two sets of residues, where each set is taken from a chain of a PDB entry. Note that RMSD is computed using carbon-alpha atoms only.

Each set of atoms is specified by a four-tuple: a PDB entry, a chain identifier, and the start and end residue numbers of the range to use. The range is inclusive.

An error is returned if any of the following holds: chainId{1,2} does not correspond to a chain in entry{1,2}; the range start{1,2}-end{1,2} is not valid; or the two ranges do not correspond to precisely the same number of carbon-alpha atoms.
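To make the equal-length precondition concrete, here is a minimal sketch of an RMSD over two paired sets of carbon-alpha positions. It is not the package's implementation: the Coords type and the plainRMSD function are hypothetical stand-ins, and the sketch performs no superposition of the two sets, which the package itself may do.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Coords is a hypothetical stand-in for a 3D atom position.
type Coords struct{ X, Y, Z float64 }

// plainRMSD returns the root-mean-square deviation between paired points.
// It performs no superposition and rejects empty or unequal-length inputs.
func plainRMSD(a, b []Coords) (float64, error) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, errors.New("atom sets must be non-empty and of equal length")
	}
	var sum float64
	for i := range a {
		dx, dy, dz := a[i].X-b[i].X, a[i].Y-b[i].Y, a[i].Z-b[i].Z
		sum += dx*dx + dy*dy + dz*dz
	}
	return math.Sqrt(sum / float64(len(a))), nil
}

func main() {
	a := []Coords{{0, 0, 0}, {1, 0, 0}}
	b := []Coords{{0, 0, 1}, {1, 0, 1}}

	// Every pair of points is exactly one unit apart along Z.
	d, err := plainRMSD(a, b)
	fmt.Println(d, err)

	// Sets of different sizes are reported as an error.
	_, err = plainRMSD(a, b[:1])
	fmt.Println(err)
}
```

Returning an error rather than truncating the longer set mirrors the documented behavior: two ranges with different numbers of carbon-alpha atoms are rejected.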

func RMSDChains

func RMSDChains(chain1 *Chain, start1, end1 int,
    chain2 *Chain, start2, end2 int) (float64, error)

RMSDChains is the same as RMSD, except it uses *Chain values directly.

type Atom

type Atom struct {
    Name string
    Het  bool
    structure.Coords
}

type Chain

type Chain struct {
    Entry    *Entry
    Ident    byte
    SeqType  SequenceType
    Sequence []seq.Residue
    Models   []*Model
    Missing  []*Residue
}

func (*Chain) AsSequence

func (c *Chain) AsSequence() seq.Sequence

AsSequence returns the chain as a sequence with an appropriate name (e.g., 1tcfA).

func (Chain) CaAtoms

func (c Chain) CaAtoms() []structure.Coords

CaAtoms returns all alpha-carbon atoms in the chain. If there is more than one model, only the first model is used.

func (Chain) IsProtein

func (c Chain) IsProtein() bool

IsProtein returns true if the chain consists of amino acids.

IsProtein also returns true if there are no SEQRES records.

func (Chain) SequenceAtoms

func (c Chain) SequenceAtoms() []*Residue

SequenceAtoms returns a slice of all residues for the chain in correspondence with the sequence in SEQRES (automatically using the first model). Note that the mapping is sparse, since not all SEQRES residues have an ATOM record.

See Model.SequenceCaAtoms for details.

func (Chain) SequenceCaAtomSlice

func (c Chain) SequenceCaAtomSlice(start, end int) []structure.Coords

SequenceCaAtomSlice attempts to extract a contiguous slice of alpha-carbon ATOM records based on *residue* index. If a contiguous slice cannot be found, nil is returned. If there is more than one model, the first model is used.

func (Chain) SequenceCaAtoms

func (c Chain) SequenceCaAtoms() []*structure.Coords

SequenceCaAtoms returns a slice of all Ca atoms for the chain in correspondence with the sequence in SEQRES (automatically using the first model).

See Model.SequenceCaAtoms for details.

type Entry

type Entry struct {
    Path   string
    IdCode string
    Chains []*Chain

    // SCOP is set whenever we see an identifier that looks like a
    // SCOP id. We use this to determine how to satisfy the Bower interface,
    // so that each entry has a unique ID.
    // Similarly for CATH.
    Scop string
    Cath string
}

func Read

func Read(reader io.Reader, fpath string) (*Entry, error)

func ReadPDB

func ReadPDB(fp string) (*Entry, error)

Example

Code:

entry := readPDB()
fmt.Printf("%s\n", entry.Chains[1].Sequence)

res := entry.Chains[0].Models[1].Residues[0]
atom := res.Atoms[1]
fmt.Printf("%s %c %0.3f %0.3f %0.3f\n",
    atom.Name, res.Name, atom.X, atom.Y, atom.Z)

Output:

AYIGPYL
CA S -18.866 9.770 -5.303

func (*Entry) Chain

func (entry *Entry) Chain(ident byte) *Chain

Chain returns a chain with the given identifier. If such a chain does not exist, nil is returned.

func (*Entry) OneChain

func (entry *Entry) OneChain() *Chain

OneChain returns a single chain in the PDB file. If there is more than one chain, OneChain will panic. This is convenient when you expect a PDB file to have only a single chain, but don't know the name.
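When a panic is not acceptable, the chain count can be checked up front. The sketch below uses hypothetical, trimmed-down Chain and Entry types and a hypothetical onlyChain helper; it treats any count other than one as an error, which is a choice of this sketch and not documented behavior of OneChain.

```go
package main

import (
	"errors"
	"fmt"
)

// Chain and Entry are hypothetical stand-ins that keep only the
// fields needed for this illustration.
type Chain struct{ Ident byte }

type Entry struct{ Chains []*Chain }

// onlyChain returns the single chain of an entry, or an error when the
// entry does not hold exactly one chain. It never panics.
func onlyChain(e *Entry) (*Chain, error) {
	if len(e.Chains) != 1 {
		return nil, errors.New("expected exactly one chain")
	}
	return e.Chains[0], nil
}

func main() {
	single := &Entry{Chains: []*Chain{{Ident: 'A'}}}
	if c, err := onlyChain(single); err == nil {
		fmt.Printf("chain %c\n", c.Ident)
	}

	double := &Entry{Chains: []*Chain{{Ident: 'A'}, {Ident: 'B'}}}
	if _, err := onlyChain(double); err != nil {
		fmt.Println("error:", err)
	}
}
```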

type Model

type Model struct {
    Entry    *Entry
    Chain    *Chain
    Num      int
    Residues []*Residue
}

func (Model) CaAtoms

func (m Model) CaAtoms() []structure.Coords

CaAtoms returns all alpha-carbon atoms in the model. This includes multiple alpha-carbon atoms belonging to the same residue. It does not include HETATMs.

func (Model) SequenceAtoms

func (m Model) SequenceAtoms() []*Residue

SequenceAtoms is just like SequenceCaAtoms, except it returns the residues instead of the alpha-carbon coordinates directly. The advantage is a mapping that isn't limited by the presence of alpha-carbon atoms.

See SequenceCaAtoms for details.

func (Model) SequenceCaAtomSlice

func (m Model) SequenceCaAtomSlice(start, end int) []structure.Coords

SequenceCaAtomSlice attempts to extract a contiguous slice of alpha-carbon ATOM records based on *residue* index. If a contiguous slice cannot be found, nil is returned.
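The "contiguous or nil" contract can be illustrated with a small sketch over a sparse slice of pointers. The Coords type and the contiguous function are hypothetical, and the half-open, 0-based [start, end) range is an assumption of the sketch; the text above does not say how the method interprets its indexes.

```go
package main

import "fmt"

// Coords is a hypothetical stand-in for a 3D atom position.
type Coords struct{ X, Y, Z float64 }

// contiguous copies seq[start:end] into a slice of values. It returns nil
// if the range is empty, out of bounds, or touches a missing (nil) entry.
func contiguous(seq []*Coords, start, end int) []Coords {
	if start < 0 || end > len(seq) || start >= end {
		return nil
	}
	out := make([]Coords, 0, end-start)
	for _, c := range seq[start:end] {
		if c == nil {
			return nil // a gap breaks contiguity
		}
		out = append(out, *c)
	}
	return out
}

func main() {
	// The entry at index 2 has no alpha-carbon ATOM record.
	seq := []*Coords{{X: 1}, {X: 2}, nil, {X: 4}}

	// Both residues in this range are present.
	fmt.Println(len(contiguous(seq, 0, 2)))

	// This range crosses the gap, so no slice is returned.
	fmt.Println(contiguous(seq, 1, 4) == nil)
}
```

Returning nil for the whole range, rather than skipping the gap, keeps the result index-aligned with the requested residues, which matters when two slices are compared position by position.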

func (Model) SequenceCaAtoms

func (m Model) SequenceCaAtoms() []*structure.Coords

SequenceCaAtoms returns a slice of all Ca atoms for the model in correspondence with the sequence in SEQRES. Note that a slice of pointers is returned, since not all residues necessarily correspond to an alpha-carbon ATOM.

This method can proceed in one of two ways. First, if "REMARK 465" is present in the PDB file, it will be used to determine the positions of the holes in the sequence (i.e., residues in SEQRES without an ATOM record). This method is generally reliable, since REMARK 465 lists all residues in SEQRES that don't have an ATOM record. This will fail if there are any unreported missing residues.

If "REMARK 465" is absent, then we have to rely on the order of ATOM records to correspond to a residue index in the SEQRES sequence. This will fail with an error if there are any unreported missing residues.

Generally, false positives are limited by returning errors if corruption is detected. However, false positives can still be returned in rare pathological cases (such as long low-complexity regions or runs of UNKNOWN amino acids), probably on the order of a handful in the entire PDB.

In sum, a list of atom pointers is returned with length equal to the number of residues in the SEQRES record for this model. Some pointers may be nil.
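The shape of the result in the "REMARK 465" case can be sketched as follows. This is only an illustration under stated assumptions: the Coords type and alignToSequence function are hypothetical, the set of missing positions is assumed to have been extracted already (parsing is not shown), and positions are 0-based indexes into the SEQRES sequence.

```go
package main

import (
	"errors"
	"fmt"
)

// Coords is a hypothetical stand-in for a 3D atom position.
type Coords struct{ X, Y, Z float64 }

// alignToSequence lays observed alpha-carbons out over a sequence of
// seqLen residues. missing holds the 0-based sequence indexes that have
// no ATOM record; those slots stay nil in the result.
func alignToSequence(seqLen int, missing map[int]bool, observed []Coords) ([]*Coords, error) {
	if len(observed)+len(missing) != seqLen {
		return nil, errors.New("observed and missing residues do not cover the sequence")
	}
	out := make([]*Coords, seqLen)
	next := 0
	for i := range out {
		if missing[i] {
			continue
		}
		if next == len(observed) {
			return nil, errors.New("missing indexes fall outside the sequence")
		}
		out[i] = &observed[next]
		next++
	}
	return out, nil
}

func main() {
	observed := []Coords{{X: 1}, {X: 2}, {X: 3}}
	aligned, err := alignToSequence(5, map[int]bool{0: true, 3: true}, observed)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, c := range aligned {
		if c == nil {
			fmt.Printf("%d: missing\n", i)
		} else {
			fmt.Printf("%d: x=%.0f\n", i, c.X)
		}
	}
}
```

The count check plays the role of the corruption detection described above: if the observed and reported-missing residues do not add up to the sequence length, some missing residue went unreported and no mapping is produced.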

type Residue

type Residue struct {
    Name          seq.Residue
    SequenceNum   int
    InsertionCode byte
    Atoms         []Atom
}

func (Residue) Ca

func (r Residue) Ca() (structure.Coords, bool)

Ca returns the alpha-carbon atom in this residue. The returned bool reports whether one was found; if the residue has no alpha-carbon, it is false.
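A value-plus-bool return like this is usually consumed with Go's "comma ok" idiom. The sketch below uses hypothetical, simplified Coords, Atom and Residue types and a hypothetical alphaCarbon function that matches on the atom name "CA"; how the package itself identifies the alpha-carbon is not stated here.

```go
package main

import "fmt"

// Coords, Atom and Residue are hypothetical, simplified stand-ins.
type Coords struct{ X, Y, Z float64 }

type Atom struct {
	Name string
	Coords
}

type Residue struct{ Atoms []Atom }

// alphaCarbon returns the coordinates of the first atom named "CA" and
// true, or the zero Coords and false when the residue has no such atom.
func alphaCarbon(r Residue) (Coords, bool) {
	for _, a := range r.Atoms {
		if a.Name == "CA" {
			return a.Coords, true
		}
	}
	return Coords{}, false
}

func main() {
	// Coordinates are illustrative only.
	ser := Residue{Atoms: []Atom{
		{Name: "N", Coords: Coords{X: 1, Y: 2, Z: 3}},
		{Name: "CA", Coords: Coords{X: -18.866, Y: 9.770, Z: -5.303}},
	}}
	if ca, ok := alphaCarbon(ser); ok {
		fmt.Printf("CA at %.3f %.3f %.3f\n", ca.X, ca.Y, ca.Z)
	}
	if _, ok := alphaCarbon(Residue{}); !ok {
		fmt.Println("no alpha-carbon in this residue")
	}
}
```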

type SequenceType

type SequenceType int

func (SequenceType) String

func (typ SequenceType) String() string

Subdirectories

Name      Synopsis
slct