Protein-DNA interactions play a key role in the regulation of gene expression and in DNA damage repair, so understanding the factors that govern their affinity and specificity is of great importance. To gain insight into these factors, we classify (cluster) eukaryotic DNA-binding domains into smaller groups (sub-families) on the basis of the information contained in their amino acid sequences. The resulting classifications and characteristic sequence signatures are mapped onto the known 3D structures of these proteins and their complexes with DNA in order to rationalize the role of specific residues in binding. We recently started a collaboration with colleagues at the University of Toronto and Harvard Medical School in Boston to relate our sequence-based classifications to DNA-binding specificities derived from high-throughput protein binding microarrays (PBMs).
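As a rough illustration of what a "characteristic sequence signature" can mean in practice, the Python sketch below scans a small alignment for columns that are invariant within each group but differ between groups. It is a minimal heuristic under stated assumptions, not the classification method used in this work. The sequences are the last 16 alignment columns of four entries from the homeodomain alignment on this page; the split into groups `a` and `b`, and the function names, are hypothetical and chosen only for the example.

```python
def conserved_columns(groups):
    """Columns where every sequence, in every group, carries the same residue."""
    seqs = [s for members in groups.values() for s in members]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    return [i for i in range(length) if len({s[i] for s in seqs}) == 1]


def group_specific_columns(groups):
    """Columns invariant within each group but not identical across groups."""
    seqs = [s for members in groups.values() for s in members]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for i in range(length):
        per_group = [{s[i] for s in members} for members in groups.values()]
        if any(len(residues) != 1 for residues in per_group):
            continue  # variable inside at least one group
        consensus = {next(iter(residues)) for residues in per_group}
        if len(consensus) > 1 and "-" not in consensus:
            hits.append(i)
    return hits


# Last 16 alignment columns of four entries from the table on this page.
# The grouping is an arbitrary assumption made for illustration only.
groups = {
    "a": ["TQVKIWFQNRRAKEKR",   # BME03454_1
          "TQVKIWFQNRRAKAKR"],  # 1ig7A
    "b": ["TTIKVWFQNRRMKDKR",   # SPE27764_1
          "STIKVWFQNRRMKDKR"],  # 1jggA
}

conserved = conserved_columns(groups)
specific = group_specific_columns(groups)
print("conserved columns (0-based):", conserved)
print("group-specific columns (0-based):", specific)
```

Column indices are 0-based offsets within the 16-residue fragment, not homeodomain residue numbers. A real analysis would use many more sequences per group and a statistical score rather than strict invariance.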
Some of our older work on protein-DNA interactions, which analyzed general properties of the interfaces in known protein-DNA complexes, is described in:
Stereo view of the HOX homeodomain 1ig7 in contact with the DNA. Key residues of the homeodomain family are shown in stick mode: conserved residues are in blue, while the others correspond to the set of 9 specificity-determining positions (SDPs) making contact with the DNA. The SDPs together contribute on average a significant 23% of the interface with the DNA, while the conserved residues add up to around 15%. In comparison, the contribution of the N-terminal coil, in terms of relative interface, adds up to 42%.

| 40 | GMD03350_1 | ------FSSFQRKGLEIQF--QQQKYITKPDRRKLAARL--NLTD--AQVKVWFQNRRMKWR- |
| 41 | GAP01770_1 | ------FTNHQIYELEKRF--LYQKYLSPADRDQIAQQL--GLTN--AQVITWFQNRRAKLKR |
| 42 | SSF35481_1 | ------FTDHQLAQLERSF--ERQKYLSVQDRMELAASL--NLTD--TQVKTWYQNRR----- |
| 43 | BMD57173_1 | ------FTDHQLQTLEKSF--ERQKYLSVQDRMELAAKL--GLTD--TQVKTWYQNRRTKWKR |
| 44 | SSF01710_1 | ------FTELQLMGLEKRF--EKQKYLSTPDRIDLAECL--DLSQ--LQVKTWYQNRRMKWKK |
| 45 | MCP03763_1 | ------FSDQQLQGLEQRF--NGQKYLSTPERISLAESL--HLSE--TQVKTWFQNRRMK--- |
| 46 | CVP02277_1 | ------FSDQQLNGLEKRF--EAQRYLSTPERVELANQL--SLSE--TQVKTWFQNRRMKHKK |
| 47 | BMD56378_1 | ------FTSEQLLELEREF--HAKKYLSLTERSQIAAAL--KLSE--VQVKIWFQNRRAKWKR |
| 48 | IPD17949_1 | ------FTHLQVLELEKKF--SRQRYLSAPERAHLASAL--RLTE--TQVKIWFQNRRYKTKR |
| 49 | CCE05724_1 | ------FTTSQLLVLERKF--LQKQYLSIAERAEFSNSL--NLTE--TQVKIWFSNTRAKAKR |
| 50 | BME03454_1 | ------FTTQQLLALERKF--RVKQYLSIAERAEFSSSL--NLTE--TQVKIWFQNRRAKEKR |
| 51 | 1ig7A | RKPRTPFTTAQLLALERKF--RQKQYLSIAERAEFSSSL--SLTE--TQVKIWFQNRRAKAKR |
| 52 | SPE19093_1 | ------FSGRQIFELEKQF--EVKKYLSASERAELASLL--NVTD--TQVKIWFQNRRTKWKK |
| 53 | SSF54974_1 | ------FSKRQIFQLESTF--DMKRYLSSAERACLASSL--QLTE--TQVKIWFQNRRNKLKR |
| 54 | SSP03100_1 | ------FSRHQVSQLEMTF--DMKRYLSSQERAHLASNL--QLTE--TQVKIWFQNRRNKWKR |
| 55 | SPE27538_1 | ------FSRSQVFQLESTF--EVKRYLSSSERAGLAANL--HLTE--TQVKIWFQNRRNKWKR |
| 56 | SSF22556_1 | ------FSRVQICELEKRF--HRQKYLASAERATLAKSL--KMTD--AQVKTWFQNRRTKWRR |
| 57 | SPE11478_1 | ------FSNDQTMELEKKF--ENQKYLSPPERKKLAKVL--QLSE--RQVKTWFQNRRAKWRR |
| 58 | SPE27764_1 | ------FTREQIGRLEKEF--ARENYVSRPKRCELATAL--NLPE--TTIKVWFQNRRMKDKR |
| 59 | 1jggA | -RYRTAFTRDQLGRLEKEF--YKENYVSRPRRCELAAQL--NLPE--STIKVWFQNRRMKDKR |
| 60 | CSH13076_1 | ------FTHEQVRQLELDF--SENHYLTRLRRYELSLKL--SLTE--RQIKVWFQNRRMKLKR |
| 61 | SPE28572_1 | ------FTKEQIRELENEF--NHHNYLTRLRRYEIAVTL--NLTE--RQVKVWFQNRRMKWKR |
| 62 | CCE07325_1 | ------FTKEQIRELESEF--AHHNYLTRLRRYEIAVNL--DLTE--RQVKVWFQNRRMKWKR |
| 78 | SRP01584_1 | ------FTTHQLTELEKEY--YTSKYLDRSRRREIAKQL--ALNE--TQVKIWFQNRRMKEKK |
| 79 | sp_P09022_HXA1_MOUSE | ---RTNFTTKQLTELEKEF--HFNKYLTRARRVEIAASL--QLNE--TQVKIWFQNRRMKQKK |
| 80 | 1b72A | --LRTNFTTRQLTELEKEF--HFNKYLSRARRVEIAATL--ELNE--TQVKIWFQNRRMKQKK |
| 92 | HRP00014_1 | ------FTPEQLERLEREF--LKQQYMVGTERFYLAKEL--NLGE--AQVKVWFQNRRIKWRK |
| 93 | ASP20371_1 | ------FTPTQADTLEKEY--LTDQYMPRTRRILIAESL--GLSE--GQVKTWFQNRRAKEKR |
| 94 | TCP00488_1 | ------FTPAQADTLEKEY--LTDQYMPRTRRILIAESL--GLNE--GQVKTWFQNRRAKEKR |
| 95 | BMD20907_1 | ------FTGDQQLRLEQTL--EKTQYINGTDRRELAQKW--GIGE--KGIKIWFQNRRMKNKR |
| 96 | BMD49473_1 | ------FTTEQINYLENEF--KKSHYISAVQRKEIANIV--NVPE--KVIKIWFQNRRMREKK |