RedLevel stands for redundancy level. GenBank and RefSeq assemblies each have a unique identifier, but the same assembly can be present in GenBank, in RefSeq, or in both. An assembly can also have multiple versions (for example, after minor changes to its annotation). Several levels were therefore defined to handle the redundancy introduced by assembly versioning and by the GenBank/RefSeq pairing.
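The two sources of redundancy described above can be illustrated with a short Python sketch. It is not the report's own definition of the levels. It assumes that accessions follow the usual NCBI pattern (`GCA_` for GenBank, `GCF_` for RefSeq, nine digits, then a version number) and that paired GenBank/RefSeq records share the same digits. The function names and the tie-breaking rule are hypothetical.

```python
import re
from collections import defaultdict

# Assumed accession pattern: GCA_ (GenBank) or GCF_ (RefSeq), 9 digits, ".version".
ACCESSION = re.compile(r"^(GCA|GCF)_(\d{9})\.(\d+)$")


def group_assemblies(accessions):
    """Group accessions by their numeric part, ignoring database prefix and version."""
    groups = defaultdict(list)
    for acc in accessions:
        match = ACCESSION.match(acc)
        if match is None:
            raise ValueError(f"not an assembly accession: {acc}")
        prefix, number, version = match.groups()
        groups[number].append((prefix, int(version)))
    return dict(groups)


def latest_per_group(accessions):
    """Keep one accession per group: the highest version, preferring
    RefSeq (GCF) on ties. The tie-break is an arbitrary choice for illustration."""
    result = {}
    for number, entries in group_assemblies(accessions).items():
        prefix, version = max(entries, key=lambda e: (e[1], e[0] == "GCF"))
        result[number] = f"{prefix}_{number}.{version}"
    return result


accessions = [
    "GCA_000001405.28",
    "GCA_000001405.29",
    "GCF_000001405.29",
    "GCA_000002035.1",
]
print(latest_per_group(accessions))
```

Counting every accession separately corresponds to the most redundant view, while counting one entry per group corresponds to the least redundant one.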
There are 4 levels of redundancy:
This section shows the percentage of genome entries at each level of redundancy, with or without ccyA.
The chart below shows how the number of genomes at each level of redundancy has increased over time.
The line plot below shows the total number of calcyanin sequences over time at the highest level of redundancy. The counts may therefore include duplicated sequences arising from the GenBank/RefSeq pairing and from assembly versioning.
The sunburst and treemap charts display the same data. They show the number of sequences per category in a hierarchical way, from the N-ter type down to the date of analysis. Click a specific area to see the number of sequences in each of its sub-categories.
The decision tree below is used to classify sequences with a significant match against the GlyX3 HMM profile. Red and green edges indicate negative and positive answers, respectively. In short, for sequences with a match against the GlyX3 HMM profile, we look at the presence and order of each Glycine Zipper on the sequence, and we use a set of known N-ter to infer the nature of the N-terminal extremity of those sequences. Finally, a label is assigned to each sequence depending on its modular organization.
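As a rough illustration of this kind of classification, the Python sketch below walks through the three questions mentioned above: is there a GlyX3 match, are the Glycine Zippers present in the expected order, and is the N-ter a known one. It does not reproduce the actual decision tree. The zipper names, their number, and the labels are hypothetical placeholders chosen for the example.

```python
# Hypothetical zipper names and expected order (assumption, not taken from the report).
EXPECTED_ZIPPERS = ("zipper_1", "zipper_2", "zipper_3")


def classify(has_glyx3_match, zippers, nter_type):
    """Return an illustrative label for one sequence.

    has_glyx3_match: True if the sequence has a significant GlyX3 HMM match.
    zippers: zipper names in the order they occur on the sequence.
    nter_type: name of the inferred known N-ter type, or None if unknown.
    """
    if not has_glyx3_match:
        # Sequences without a GlyX3 match are outside the scope of the tree.
        return None
    canonical = tuple(zippers) == EXPECTED_ZIPPERS
    organization = "canonical zippers" if canonical else "atypical zippers"
    nter = "known N-ter" if nter_type is not None else "unknown N-ter"
    return f"{organization}, {nter}"


print(classify(True, ["zipper_1", "zipper_2", "zipper_3"], "type_A"))
print(classify(True, ["zipper_1", "zipper_3"], None))
```

Here a missing or reordered zipper is enough to mark the organization as atypical. The real tree may distinguish more cases than this two-by-two split.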