
The SNP ID is an internal reference for the SNP. All SNPs found in this study have been submitted to dbSNP (http://www.ncbi.nlm.nih.gov/SNP/) and can be searched based on these values. SNPs detected from a literature search were not submitted to dbSNP.
The Exon most often represents the exon a SNP is contained in or adjacent to. Sequence within an "exon" can include flanking intronic segments or 5' and 3' UTR.
The Position in exon is the location of a SNP relative to the "exon" record. It does not refer to a specific location in the true exon relative to any standard naming convention.
The Confirmation is the current status of each SNP. All SNPs are "candidates" until confirmed by an independent method. Methods of confirmation include sequencing, identifying polymorphic individuals on a genotyping array, and comparison to previously reported sites. A SNP is either "Confirmed" or "Not Confirmed." SNPs, particularly rare SNPs, are often polymorphic in only one or some populations rather than in all.
The Call Quality is either "Certain" or "Likely" for each SNP, as determined by the program Ulysees version 1.0. "Certain" SNPs have been confirmed 100% of the time in this study. "Likely" SNPs have been confirmed 71% of the time in this study. Polymorphisms that have been identified as false positives are often in the "Rare" heterozygosity class.
The Heterozygosity Class is either "Rare," "Low," "Medium," or "High." These refer to SNPs whose heterozygosity (calculated as 2pq, where p and q are the frequencies of the two alleles) is between 0.01-0.10, 0.10-0.20, 0.20-0.40, and 0.40-0.50, respectively.
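As an illustration of the class boundaries above, the following is a minimal Python sketch that computes 2pq from one allele frequency and assigns a class. The function names are hypothetical, and because the stated ranges share endpoints, the choice to place a boundary value (for example, exactly 0.10) in the higher class is an assumption, not something this page specifies.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2pq for a biallelic site, with q = 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1.0 - p
    return 2.0 * p * q


def heterozygosity_class(h: float) -> str:
    """Map a heterozygosity value to the class names used in the table.

    Assumption: a value lying exactly on a shared boundary goes to the
    higher class. Values outside 0.01-0.50 are rejected.
    """
    if not 0.01 <= h <= 0.50:
        raise ValueError("heterozygosity outside the documented 0.01-0.50 range")
    if h < 0.10:
        return "Rare"
    if h < 0.20:
        return "Low"
    if h < 0.40:
        return "Medium"
    return "High"


if __name__ == "__main__":
    for p in (0.05, 0.10, 0.25, 0.50):
        h = heterozygosity(p)
        print(p, round(h, 3), heterozygosity_class(h))
```

Under these assumptions an allele frequency of 0.05 gives a heterozygosity of about 0.095 and falls in the "Rare" class, while a frequency of 0.50 gives the maximum value of 0.50 and falls in "High."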
The Sample Detected refers to the sample collection or collections in which a SNP was detected; it takes one of four values. The "African" sample, of individuals from Harare, Zimbabwe, consisted of 80 chromosomes. The "U.S. White" sample, of individuals from Tecumseh, Michigan and 2 CEPH individuals (1331-01,-2), consisted of 68 chromosomes. "Both" indicates that the SNP was detected in both the African and U.S. White samples. "cDNA" refers to a set of 88 chromosomes obtained from cell lines.
The Location refers to the position of a SNP relative to a gene. The positions are "Promoter," "5' UTR," "Coding," "Intron," and "3' UTR." "Coding" refers to SNPs in the translated regions of genes and does not include any untranslated regions of exons.
If the SNP is in a "Coding" location, amino acid information is provided. Amino Acid 1 represents the amino acid (in three letter form) from one of the two alleles.
If the SNP is in a "Coding" location, amino acid information is provided. Amino Acid 2 represents the amino acid (in three letter form) from the second of the two alleles.
The Type of Amino Acid Change can be either a "Synonymous" (does not alter the amino acid) or "Nonsynonymous" (alters the amino acid) change.
The Codon Position refers to the position in the codon where a SNP is located. The value can be either "1," "2," "3," or "0," corresponding to the 1st codon position, the 2nd codon position, the 3rd codon position, and a SNP not occurring in a codon, respectively.
Codon 1 is the three-nucleotide code for an amino acid at the site where a SNP was detected.
Codon 2 is the alternative three-nucleotide code for an amino acid at the site where a SNP was detected.
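The relationship between Codon 1, Codon 2, Codon Position, and the Type of Amino Acid Change can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, codons are assumed to be written in the DNA alphabet (T rather than U), the standard genetic code is assumed, and amino acids are returned as one-letter codes for brevity even though the table itself uses the three-letter form.

```python
# Standard genetic code, built from the conventional TCAG ordering.
# "*" marks a stop codon.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def classify_change(codon1: str, codon2: str):
    """Return (codon position, amino acid 1, amino acid 2, change type).

    Assumes the two codons differ at exactly one nucleotide, as expected
    for a single SNP; anything else raises ValueError.
    """
    codon1, codon2 = codon1.upper(), codon2.upper()
    if codon1 not in GENETIC_CODE or codon2 not in GENETIC_CODE:
        raise ValueError("codons must be three letters from A, C, G, T")
    differing = [i + 1 for i in range(3) if codon1[i] != codon2[i]]
    if len(differing) != 1:
        raise ValueError("codons must differ at exactly one position")
    aa1, aa2 = GENETIC_CODE[codon1], GENETIC_CODE[codon2]
    change = "Synonymous" if aa1 == aa2 else "Nonsynonymous"
    return differing[0], aa1, aa2, change


if __name__ == "__main__":
    print(classify_change("GAA", "GAG"))  # third-position change, Glu in both codons
    print(classify_change("GAG", "GTG"))  # second-position change, Glu to Val
```

A SNP with Codon Position "0" has no codon pair to compare, so it falls outside what this sketch handles.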
The GenBank Accession value identifies the segment of DNA that the SNP is located within. Each GenBank Accession value is hyperlinked to the GenBank record for additional information.
The Allele 1 value refers to the "reference nucleotide allele" at a given position. This reference status is designated by it being the nucleotide found on the GenBank record. This "Allele 1" is not always the more common allele.
The Allele 1 Frequency is the frequency of that allele in the samples used for its detection. If the only sample a SNP was detected in was the African sample, then this frequency is based upon 80 chromosomes. If it was detected in "Both" samples, then its frequency is based on 148 chromosomes. This value was calculated as p/(p+q) and has been rounded to the nearest 0.05.
The Allele 2 value refers to the "alternative nucleotide allele" at a given position. This nucleotide was discovered in the SNP detection survey. It is not always the rare allele.
The Allele 2 Frequency is the frequency of that allele in the samples used for its detection. If the only sample a SNP was detected in was the African sample, then this frequency is based upon 80 chromosomes. If it was detected in "Both" samples, then its frequency is based on 148 chromosomes. This value was calculated as q/(p+q) and has been rounded to the nearest 0.05.
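The frequency calculation and rounding described for the two alleles might look like the following Python sketch. It assumes that p and q in the formulas are counts of chromosomes carrying Allele 1 and Allele 2, which this page does not state explicitly. The function name is hypothetical, and the handling of exact ties (Python rounds halves to the nearest even step) is likewise an assumption.

```python
def rounded_frequencies(count_allele1: int, count_allele2: int):
    """Return (Allele 1 frequency, Allele 2 frequency), each rounded to 0.05.

    Assumption: the arguments are chromosome counts, so the total is
    80 for the African sample alone or 148 for "Both" samples.
    """
    total = count_allele1 + count_allele2
    if count_allele1 < 0 or count_allele2 < 0 or total == 0:
        raise ValueError("counts must be non-negative and not both zero")

    def to_nearest_005(count: int) -> float:
        # Work in steps of 1/20 so the result is a clean multiple of 0.05.
        steps = round(20 * count / total)
        return round(steps / 20, 2)

    return to_nearest_005(count_allele1), to_nearest_005(count_allele2)


if __name__ == "__main__":
    # 68 of 80 chromosomes carry Allele 1, 12 carry Allele 2.
    print(rounded_frequencies(68, 12))
    # 141 of 148 chromosomes carry Allele 1, 7 carry Allele 2.
    print(rounded_frequencies(141, 7))
```

Because each frequency is rounded to the nearest 0.05, an allele seen on only one or two chromosomes would be reported as 0.00 under this scheme.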
The Sequence is a flanking segment of ~100 bp of DNA for each polymorphism. The site of the polymorphism is designated as "*" in each sequence record.
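A Sequence record can be split at the "*" marker to recover the flanking segments and to write out the sequence for each allele. The Python sketch below assumes that "*" stands in place of the single polymorphic nucleotide (rather than being inserted next to it) and that it occurs exactly once per record. The function name and the short example sequence are hypothetical.

```python
def expand_alleles(sequence: str, allele1: str, allele2: str):
    """Return (0-based SNP offset, Allele 1 sequence, Allele 2 sequence).

    Assumption: "*" replaces the polymorphic nucleotide and appears once.
    """
    if sequence.count("*") != 1:
        raise ValueError('sequence must contain exactly one "*" marker')
    left, right = sequence.split("*")
    return len(left), left + allele1 + right, left + allele2 + right


if __name__ == "__main__":
    # Shortened, made-up record; real records carry ~100 bp of flanking DNA.
    offset, seq1, seq2 = expand_alleles("ACGT*TTGA", "C", "T")
    print(offset, seq1, seq2)
```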
General considerations:
Specific help for columns.
SNP ID
Exon
Position in exon
Confirmation
Call Quality
Heterozygosity Class
Sample Detected
Location
Amino Acid 1
Amino Acid 2
Type of Amino Acid Change
Codon Position
Codon 1
Codon 2
GenBank Accession
Allele 1
Allele 1 Frequency
Allele 2
Allele 2 Frequency
Sequence