import React from 'react'

import { Container, Divider, Grid } from 'semantic-ui-react'

import './index.css'

class About extends React.Component {
  render () {
    return (
      <Grid centered>
        <Grid.Row>
          <h2>Frequently Asked Questions</h2>
        </Grid.Row>
        <Grid.Row>
          <Container className='about' textAlign='justified'>
            <Divider />

            <h3>Usage</h3>

            <h4>How do I use SNPDogg?</h4>
            <p>
              Just enter a set of missense mutations (hg19 build) into the white
              box on the SNPDogg homepage. Then hit “Enter.”
            </p>

            <h4>How do I interpret the SNPDogg score?</h4>
            <p>
              The SNPDogg score ranges from 0 to 1, with larger values
              indicating a higher confidence in pathogenicity. The score
              emphatically should not be interpreted as a probability of
              pathogenicity. In general, a score of 0.8 or higher is a pretty
              confident indication that the SNP causes loss of protein function.
              Predicted false-positive/false negative rates for different values
              are shown in our publication and here.
            </p>

            <h4>How do I interpret the feature contribution plot?</h4>
            <p>
              The features are arranged in order of their contribution to the
              prediction (so that the first feature is most powerful). The
              horizontal axis gives the contribution of the feature to the
              diagnosis (either towards benignness or pathogenicity). The color
              gives the actual value of the feature in the SNP, normalized as a
              Z-score relative to the whole training set.
            </p>

            <h4>
              Why do some feature contribution bars show up black? Is this a
              bug?
            </h4>

            <p>
              Black represents a null value, i.e. the feature is missing. The
              xgboost algorithm is smart enough to train on/utilize null values
              by imputing values.
            </p>

            <h4>
              Why is this weird feature (e.g. proportion of T’s in the 5 prime
              UTR) contributing to my prediction? It is clearly absurd that
              nucleotide proportions should be usable to predict pathogenicity.
            </h4>
            <p>
              The SNPDogg algorithm seems to use nucleotide proportions as a
              proxy for gene identity. Thus, if the algorithm sees that the
              SNP’s UTR contains 20% thymine, this narrows down the list of
              possible genes that the SNP could reside in. The specific gene
              under consideration is, clearly, a useful variable in making the
              prediction.
            </p>

            <h4>What does the "raw/meta" button do?</h4>
            <p>
              It switches between two different versions of SNPDogg. The “meta”
              version uses other missense pathogenicity predictors (such as
              GERP) in its prediction, in addition to all the features used in
              the “raw” model. For intuitive feature-explanations we recommend
              the raw model, even though the meta model shades it very slightly
              in accuracy.
            </p>

            <h4>What is the feature name suppose to mean?</h4>

            <p>
              Detailed descriptions of all features are given here. If you have
              further questions, please don’t hesitate to contact us.
            </p>

            <h4>
              Can you guarantee that the predictions of SNPDogg are accurate?
            </h4>

            <p>
              Of course not. However, our testing indicates that SNPDogg is
              significantly more accurate than any existing missense classifier,
              with the exception of REVEL (which it shades slightly). If someone
              offered us $10 million to successfully classify an unknown
              missense variant using any single in silico method we liked, we
              would use SNPDogg.
            </p>

            <h4>If I use SNPDogg in a publication, how do I cite it?</h4>

            <p>Please cite the SNPDogg paper directly.</p>

            <Divider />

            <h3>Technical</h3>

            <h4>
              What machine learning algorithm does SNPDogg use, and how many
              variants was it trained on?
            </h4>

            <p>
              SNPDogg uses gradient-boosted trees as implemented in the xgboost
              python package. It is trained on 36,799 variants that appear in
              the ClinVar, UniProt and VariBench datasets but do not appear in
              the HumVar, ExoVar, PredictSNP, SwissVar or our in-house IGM
              dataset. For more details on architecture, training and testing,
              see our publication.
            </p>

            <h4>How do you compute feature importance?</h4>

            <p>
              They are shapley values computed by the TreeSHAP algorithm
              implemented in the{' '}
              <a
                href='https://github.com/slundberg/shap'
                target='_blank'
                rel='noopener noreferrer'
              >
                "shap"
              </a>{' '}
              python package. Shapley values come from game theory. In a game of
              soccer (single variant interpretation), the final score of the
              team is the prediction from the machine learning model. The
              players who do not contribute to the score have shapley values of
              0, players who actively blocked or scored goals have higher
              positive values (the pathogenic drivers for the variant), and the
              players that hurt the score have more negative values (benign
              drivers of the variant). A game can have both pathogenic and
              benign drivers and are ordered by their absolute magnitude to see
              which players contributed the most to the score.
            </p>

            <h4>
              How does the "feature contribution" for each feature actually add
              up to the prediction?
            </h4>

            <p>
              Technically, they do not. The prediction is computed first, and
              then feature contribution is derived. However, the SNPDogg score
              can be recovered from the feature importance values because it is
              the exponential of the sum of all feature contribution scores
              multiplied by a baseline probability that is the same for all SNPs
              (around 0.6).
            </p>

            <Divider />

            <h3>Further Directions</h3>

            <h4>
              Can I download SNPDogg scores and features importances in bulk?
            </h4>

            <p>
              A full database of SNPDogg scores for most missense SNPs in the
              genome can be accessed here. The training/testing set, complete
              with all annotations used to train SNPDogg, can be accessed here.
              Persons interested in downloading a larger set of SNPs with all
              the training annotations or feature-importance values, should
              contact the authors here.
            </p>

            <h4>
              Have you thought about extending SNPDogg to other types of
              variants: nonsense, indels, non-coding, etc.?å
            </h4>

            <p>
              Yes, we have thought about it. Evaluating nonsense variants seems
              unnecessary since these are typically pathogenic. With regard to
              indels and non-coding variants, the pool of available variants for
              training/testing is drastically smaller and less reliable than in
              the missense case, making the problem much more challenging. We do
              not currently have any plans in this direction.
            </p>

            <Divider />

            <h3>Non-Scientific</h3>

            <h4>Why came up with the name SNPDogg?</h4>
            <p>The girlfriend of the first author.</p>

            <h4>Are you sure that Snoop Dogg is okay with this?</h4>

            <p>
              We are helping to cure sick kids! Of course, he is okay with it.
            </p>

            <h4>
              Do you not realize that the preferred nomenclature for a single
              nucleotide subsitution is "snv" (single nucleotide variant) rather
              than "snp"?
            </h4>
            <p>
              <a
                href='https://www.dictionary.com/browse/pedant'
                target='_blank'
                rel='noopener noreferrer'
              >
                Click here
              </a>{' '}
              to answer your question.
            </p>
          </Container>
        </Grid.Row>
      </Grid>
    )
  }
}

export default About
