
Protein: POLG_WNV (P06935)

Summary

This is the summary of UniProt entry POLG_WNV (P06935).

Description: Genome polyprotein EC=3.4.21.91 EC=3.6.1.15 EC=3.6.4.13 EC=2.1.1.56 EC=2.1.1.57 EC=2.7.7.48
Source organism: West Nile virus (WNV) (NCBI taxonomy ID 11082)
Length: 3430 amino acids

Please note: when we start each new Pfam data release, we take a copy of the UniProt sequence database. This snapshot of UniProt forms the basis of the overview that you see here. It is important to note that, although some UniProt entries may be removed after a Pfam release, these entries will not be removed from Pfam until the next Pfam data release.

Pfam domains

Source Domain Start End
disorder n/a 1 15
Pfam A Flavi_capsid 6 123
transmembrane n/a 21 38
transmembrane n/a 44 66
low_complexity n/a 46 60
disorder n/a 95 97
transmembrane n/a 107 126
Pfam A Flavi_propep 128 214
low_complexity n/a 208 216
disorder n/a 215 216
Pfam A Flavi_M 216 290
transmembrane n/a 250 269
transmembrane n/a 276 292
Pfam A Flavi_glycoprot 291 585
low_complexity n/a 405 416
disorder n/a 522 527
Pfam A Flavi_glycop_C 587 685
disorder n/a 609 610
low_complexity n/a 714 725
transmembrane n/a 739 760
transmembrane n/a 767 787
Pfam A Flavi_NS1 789 1143
disorder n/a 922 924
disorder n/a 932 935
disorder n/a 1040 1052
disorder n/a 1078 1083
disorder n/a 1085 1086
transmembrane n/a 1144 1163
Pfam A Flavi_NS2A 1151 1370
transmembrane n/a 1175 1193
transmembrane n/a 1213 1231
transmembrane n/a 1243 1260
low_complexity n/a 1245 1256
transmembrane n/a 1280 1301
transmembrane n/a 1308 1326
transmembrane n/a 1338 1362
Pfam A Flavi_NS2B 1374 1501
transmembrane n/a 1374 1392
transmembrane n/a 1398 1416
transmembrane n/a 1473 1494
disorder n/a 1509 1511
Pfam A Peptidase_S7 1517 1669
disorder n/a 1601 1604
disorder n/a 1610 1615
disorder n/a 1628 1629
disorder n/a 1667 1668
disorder n/a 1673 1681
Pfam A Flavi_DEAD 1685 1832
disorder n/a 1819 1820
disorder n/a 1825 1838
Pfam A Helicase_C 1878 1967
disorder n/a 1956 1958
disorder n/a 1961 1963
disorder n/a 1973 1976
disorder n/a 2022 2023
Pfam A Flavi_NS4A 2123 2267
transmembrane n/a 2170 2191
low_complexity n/a 2195 2208
transmembrane n/a 2200 2218
transmembrane n/a 2224 2241
low_complexity n/a 2227 2239
transmembrane n/a 2253 2270
Pfam A Flavi_NS4B 2270 2519
transmembrane n/a 2309 2326
transmembrane n/a 2347 2370
transmembrane n/a 2376 2395
low_complexity n/a 2376 2388
transmembrane n/a 2443 2462
disorder n/a 2564 2565
disorder n/a 2567 2577
Pfam A FtsJ 2579 2750
disorder n/a 2625 2634
disorder n/a 2777 2780
Pfam A Flavi_NS5 2778 3426
disorder n/a 2790 2792
disorder n/a 2794 2799
low_complexity n/a 2860 2874
disorder n/a 2866 2870
disorder n/a 2873 2874
disorder n/a 2876 2889
disorder n/a 2892 2897
disorder n/a 3118 3120
disorder n/a 3159 3164
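The rows above follow a simple pattern: a source label (which may contain a space, as in "Pfam A"), a domain name or "n/a", and 1-based inclusive start and end positions. As a minimal sketch, assuming that layout holds for every row, the table can be parsed and queried for features that overlap a given domain. The `Feature` type and the `parse_row` and `overlapping` helpers are hypothetical names introduced for illustration; they are not part of any Pfam tool.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    source: str  # e.g. "Pfam A", "transmembrane", "disorder"
    name: str    # domain name, or "n/a" for predicted features
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive


def parse_row(line: str) -> Feature:
    """Parse one table row; the last three fields are name, start, end."""
    *source_parts, name, start, end = line.split()
    return Feature(" ".join(source_parts), name, int(start), int(end))


def overlapping(domain: Feature, features: List[Feature]) -> List[Feature]:
    """Return the other features sharing at least one residue with domain."""
    return [
        f for f in features
        if f is not domain and f.start <= domain.end and f.end >= domain.start
    ]


# A few rows copied from the table above.
rows = [
    "Pfam A Flavi_NS1 789 1143",
    "transmembrane n/a 1144 1163",
    "Pfam A Flavi_NS2A 1151 1370",
    "transmembrane n/a 1175 1193",
]
features = [parse_row(r) for r in rows]
ns2a = features[2]
ns2a_length = ns2a.end - ns2a.start + 1  # inclusive coordinates
ns2a_overlaps = overlapping(ns2a, features)
```

Because both coordinates are inclusive, a domain's length is `end - start + 1`, and two features overlap when each one starts no later than the other ends.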


Sequence annotations

This section shows a graphical representation of this sequence, with Pfam domains shown in the standard Pfam format. Under the Pfam domain image we show various tracks, illustrating features on this sequence that we found in other databases. You can choose which databases to include using the drop-down panel under the image.


Sequence information

This is the amino acid sequence of the UniProt sequence database entry with the accession P06935. The sequence is stored in the Pfam database and updated only with each new Pfam release, so the copy we store may differ from the one currently held by UniProt.

Sequence:
1
MSKKPGGPGK NRAVNMLKRG MPRGLSLIGL KRAMLSLIDG KGPIRFVLAL
50
51
LAFFRFTAIA PTRAVLDRWR GVNKQTAMKH LLSFKKELGT LTSAINRRST
100
101
KQKKRGGTAG FTILLGLIAC AGAVTLSNFQ GKVMMTVNAT DVTDVITIPT
150
151
AAGKNLCIVR AMDVGYLCED TITYECPVLA AGNDPEDIDC WCTKSSVYVR
200
201
YGRCTKTRHS RRSRRSLTVQ THGESTLANK KGAWLDSTKA TRYLVKTESW
250
251
ILRNPGYALV AAVIGWMLGS NTMQRVVFAI LLLLVAPAYS FNCLGMSNRD
300
301
FLEGVSGATW VDLVLEGDSC VTIMSKDKPT IDVKMMNMEA ANLADVRSYC
350
351
YLASVSDLST RAACPTMGEA HNEKRADPAF VCKQGVVDRG WGNGCGLFGK
400
401
GSIDTCAKFA CTTKATGWII QKENIKYEVA IFVHGPTTVE SHGKIGATQA
450
451
GRFSITPSAP SYTLKLGEYG EVTVDCEPRS GIDTSAYYVM SVGEKSFLVH
500
501
REWFMDLNLP WSSAGSTTWR NRETLMEFEE PHATKQSVVA LGSQEGALHQ
550
551
ALAGAIPVEF SSNTVKLTSG HLKCRVKMEK LQLKGTTYGV CSKAFKFART
600
601
PADTGHGTVV LELQYTGTDG PCKVPISSVA SLNDLTPVGR LVTVNPFVSV
650
651
ATANSKVLIE LEPPFGDSYI VVGRGEQQIN HHWHKSGSSI GKAFTTTLRG
700
701
AQRLAALGDT AWDFGSVGGV FTSVGKAIHQ VFGGAFRSLF GGMSWITQGL
750
751
LGALLLWMGI NARDRSIAMT FLAVGGVLLF LSVNVHADTG CAIDIGRQEL
800
801
RCGSGVFIHN DVEAWMDRYK FYPETPQGLA KIIQKAHAEG VCGLRSVSRL
850
851
EHQMWEAIKD ELNTLLKENG VDLSVVVEKQ NGMYKAAPKR LAATTEKLEM
900
901
GWKAWGKSII FAPELANNTF VIDGPETEEC PTANRAWNSM EVEDFGFGLT
950
951
STRMFLRIRE TNTTECDSKI IGTAVKNNMA VHSDLSYWIE SGLNDTWKLE
1000
1001
RAVLGEVKSC TWPETHTLWG DGVLESDLII PITLAGPRSN HNRRPGYKTQ
1050
1051
NQGPWDEGRV EIDFDYCPGT TVTISDSCEH RGPAARTTTE SGKLITDWCC
1100
1101
RSCTLPPLRF QTENGCWYGM EIRPTRHDEK TLVQSRVNAY NADMIDPFQL
1150
1151
GLMVVFLATQ EVLRKRWTAK ISIPAIMLAL LVLVFGGITY TDVLRYVILV
1200
1201
GAAFAEANSG GDVVHLALMA TFKIQPVFLV ASFLKARWTN QESILLMLAA
1250
1251
AFFQMAYYDA KNVLSWEVPD VLNSLSVAWM ILRAISFTNT SNVVVPLLAL
1300
1301
LTPGLKCLNL DVYRILLLMV GVGSLIKEKR SSAAKKKGAC LICLALASTG
1350
1351
VFNPMILAAG LMACDPNRKR GWPATEVMTA VGLMFAIVGG LAELDIDSMA
1400
1401
IPMTIAGLMF AAFVISGKST DMWIERTADI TWESDAEITG SSERVDVRLD
1450
1451
DDGNFQLMND PGAPWKIWML RMACLAISAY TPWAILPSVI GFWITLQYTK
1500
1501
RGGVLWDTPS PKEYKKGDTT TGVYRIMTRG LLGSYQAGAG VMVEGVFHTL
1550
1551
WHTTKGAALM SGEGRLDPYW GSVKEDRLCY GGPWKLQHKW NGHDEVQMIV
1600
1601
VEPGKNVKNV QTKPGVFKTP EGEIGAVTLD YPTGTSGSPI VDKNGDVIGL
1650
1651
YGNGVIMPNG SYISAIVQGE RMEEPAPAGF EPEMLRKKQI TVLDLHPGAG
1700
1701
KTRKILPQII KEAINKRLRT AVLAPTRVVA AEMSEALRGL PIRYQTSAVH
1750
1751
REHSGNEIVD VMCHATLTHR LMSPHRVPNY NLFIMDEAHF TDPASIAARG
1800
1801
YIATKVELGE AAAIFMTATP PGTSDPFPES NAPISDMQTE IPDRAWNTGY
1850
1851
EWITEYVGKT VWFVPSVKMG NEIALCLQRA GKKVIQLNRK SYETEYPKCK
1900
1901
NDDWDFVITT DISEMGANFK ASRVIDSRKS VKPTIIEEGD GRVILGEPSA
1950
1951
ITAASAAQRR GRIGRNPSQV GDEYCYGGHT NEDDSNFAHW TEARIMLDNI
2000
2001
NMPNGLVAQL YQPEREKVYT MDGEYRLRGE ERKNFLEFLR TADLPVWLAY
2050
2051
KVAAAGISYH DRKWCFDGPR TNTILEDNNE VEVITKLGER KILRPRWADA
2100
2101
RVYSDHQALK SFKDFASGKR SQIGLVEVLG RMPEHFMVKT WEALDTMYVV
2150
2151
ATAEKGGRAH RMALEELPDA LQTIVLIALL SVMSLGVFFL LMQRKGIGKI
2200
2201
GLGGVILGAA TFFCWMAEVP GTKIAGMLLL SLLLMIVLIP EPEKQRSQTD
2250
2251
NQLAVFLICV LTLVGAVAAN EMGWLDKTKN DIGSLLGHRP EARETTLGVE
2300
2301
SFLLDLRPAT AWSLYAVTTA VLTPLLKHLI TSDYINTSLT SINVQASALF
2350
2351
TLARGFPFVD VGVSALLLAV GCWGQVTLTV TVTAAALLFC HYAYMVPGWQ
2400
2401
AEAMRSAQRR TAAGIMKNVV VDGIVATDVP ELERTTPVMQ KKVGQIILIL
2450
2451
VSMAAVVVNP SVRTVREAGI LTTAAAVTLW ENGASSVWNA TTAIGLCHIM
2500
2501
RGGWLSCLSI MWTLIKNMEK PGLKRGGAKG RTLGEVWKER LNHMTKEEFT
2550
2551
RYRKEAITEV DRSAAKHARR EGNITGGHPV SRGTAKLRWL VERRFLEPVG
2600
2601
KVVDLGCGRG GWCYYMATQK RVQEVKGYTK GGPGHEEPQL VQSYGWNIVT
2650
2651
MKSGVDVFYR PSEASDTLLC DIGESSSSAE VEEHRTVRVL EMVEDWLHRG
2700
2701
PKEFCIKVLC PYMPKVIEKM ETLQRRYGGG LIRNPLSRNS THEMYWVSHA
2750
2751
SGNIVHSVNM TSQVLLGRME KKTWKGPQFE EDVNLGSGTR AVGKPLLNSD
2800
2801
TSKIKNRIER LKKEYSSTWH QDANHPYRTW NYHGSYEVKP TGSASSLVNG
2850
2851
VVRLLSKPWD TITNVTTMAM TDTTPFGQQR VFKEKVDTKA PEPPEGVKYV
2900
2901
LNETTNWLWA FLARDKKPRM CSREEFIGKV NSNAALGAMF EEQNQWKNAR
2950
2951
EAVEDPKFWE MVDEEREAHL RGECNTCIYN MMGKREKKPG EFGKAKGSRA
3000
3001
IWFMWLGARF LEFEALGFLN EDHWLGRKNS GGGVEGLGLQ KLGYILKEVG
3050
3051
TKPGGKVYAD DTAGWDTRIT KADLENEAKV LELLDGEHRR LARSIIELTY
3100
3101
RHKVVKVMRP AADGKTVMDV ISREDQRGSG QVVTYALNTF TNLAVQLVRM
3150
3151
MEGEGVIGPD DVEKLGKGKG PKVRTWLFEN GEERLSRMAV SGDDCVVKPL
3200
3201
DDRFATSLHF LNAMSKVRKD IQEWKPSTGW YDWQQVPFCS NHFTELIMKD
3250
3251
GRTLVVPCRG QDELIGRARI SPGAGWNVRD TACLAKSYAQ MWLLLYFHRR
3300
3301
DLRLMANAIC SAVPANWVPT GRTTWSIHAK GEWMTTEDML AVWNRVWIEE
3350
3351
NEWMEDKTPV ERWSDVPYSG KREDIWCGSL IGTRTRATWA ENIHVAINQV
3400
3401
RSVIGEEKYV DYMSSLRRYE DTIVVEDTVL
3430


Checksums:
CRC64:42D71B7CB12DC45B
MD5:c500693d89d1b6d2bda240a4e4fd5d28
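To compare a local copy of the sequence against the MD5 checksum above, the formatted block first has to be reduced to residue letters only. The following is a minimal sketch using Python's standard `hashlib` and `re` modules. It assumes that the checksum was computed over the plain upper-case sequence with no whitespace or position markers, which the page does not state explicitly. The helper names `strip_formatting` and `md5_hex` are hypothetical.

```python
import hashlib
import re


def strip_formatting(block: str) -> str:
    """Remove position markers and whitespace, keeping only residue letters."""
    return re.sub(r"[^A-Za-z]", "", block).upper()


def md5_hex(sequence: str) -> str:
    """Hex MD5 digest of the sequence, encoded as ASCII."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


# First formatted line of the sequence block, including its position markers.
sample = """1
MSKKPGGPGK NRAVNMLKRG MPRGLSLIGL KRAMLSLIDG KGPIRFVLAL
50
"""
sample_sequence = strip_formatting(sample)
sample_digest = md5_hex(sample_sequence)
```

Passing the whole sequence block through `strip_formatting` should yield 3430 residues, matching the length given in the summary. If the digest of that string does not match the listed MD5, the assumption about how the checksum was computed, or the local copy of the sequence, is worth rechecking.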

Structures

For sequences that have a structure in the Protein Data Bank (PDB), we use the mapping between UniProt, PDB and Pfam coordinate systems from the MSD group to map Pfam domains onto both UniProt sequences and three-dimensional structures. The table below shows the mapping between Pfam domains, this UniProt entry and the corresponding three-dimensional structures.

Pfam family UniProt residues PDB ID PDB chain ID PDB residues
Flavi_glycop_C 587 - 685 2P5P A 301 - 399
Flavi_glycop_C 587 - 685 2P5P B 301 - 399
Flavi_glycop_C 587 - 685 2P5P C 301 - 399
Flavi_glycoprot 297 - 583 3I50 E 7 - 297
Flavi_NS2B 1419 - 1459 2IJO A 49 - 89
Flavi_NS2B 1420 - 1458 2FP7 A 50 - 88
Flavi_NS2B 1420 - 1459 3E90 C 50 - 89
Flavi_NS2B 1420 - 1463 2GGV A 50 - 93
Flavi_NS2B 1421 - 1463 3E90 A 51 - 93
Peptidase_S7 1517 - 1669 2GGV B 16 - 168
Peptidase_S7 1517 - 1669 2IJO B 16 - 168
Peptidase_S7 1517 - 1669 3E90 B 16 - 168
Peptidase_S7 1517 - 1669 3E90 D 16 - 168
Peptidase_S7 1520 - 1669 2FP7 B 19 - 168
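For most rows in the table above, the UniProt and PDB ranges have the same length, so a position can be converted between the two numbering schemes with a constant offset. The sketch below illustrates this under a strong assumption: that both ranges are contiguous, with no gaps or insertion codes in the PDB numbering. The table itself does not guarantee this. The function `uniprot_to_pdb` is a hypothetical helper, not part of Pfam or any PDB library, and it refuses rows whose two ranges differ in length (such as the 3I50 row).

```python
def uniprot_to_pdb(pos: int, uniprot_start: int, uniprot_end: int,
                   pdb_start: int, pdb_end: int) -> int:
    """Map a UniProt position to PDB numbering using a constant offset.

    Assumes both ranges are contiguous; raises ValueError otherwise
    detectable problems (unequal lengths, position outside the range).
    """
    if uniprot_end - uniprot_start != pdb_end - pdb_start:
        raise ValueError("ranges differ in length; a constant offset does not apply")
    if not uniprot_start <= pos <= uniprot_end:
        raise ValueError("position lies outside the mapped UniProt range")
    return pos - uniprot_start + pdb_start


# Flavi_glycop_C, PDB 2P5P chain A: UniProt 587 - 685 maps to PDB 301 - 399.
glycop_c_first = uniprot_to_pdb(587, 587, 685, 301, 399)

# Peptidase_S7, PDB 2GGV chain B: UniProt 1517 - 1669 maps to PDB 16 - 168.
protease_pos = uniprot_to_pdb(1600, 1517, 1669, 16, 168)
```

For rows where the lengths disagree, a residue-level mapping from the structure entry itself is needed rather than a simple offset.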