File format reference

https://www.cog-genomics.org/plink/1.9/formats

.map

(PLINK text fileset variant information file)

Variant information file accompanying a .ped text pedigree + genotype table. Also generated by "--recode rlist".

A text file with no header line, and one line per variant with the following 3-4 fields:

  1. Chromosome code. PLINK 1.9 also permits contig names here, but most older programs do not.
  2. Variant identifier
  3. Position in morgans or centimorgans (optional; also safe to use dummy value of '0')
  4. Base-pair coordinate
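
As a minimal sketch of the layout above, the following Python function parses one .map line. The names MapRecord and parse_map_line are illustrative and not part of PLINK. The sketch assumes whitespace-delimited fields and treats a 3-field line as one that omits the genetic position.

```python
from typing import NamedTuple


class MapRecord(NamedTuple):
    chrom: str       # chromosome code or contig name
    variant_id: str  # variant identifier
    cm: float        # genetic position; 0.0 when the column is absent
    bp: int          # base-pair coordinate


def parse_map_line(line: str) -> MapRecord:
    """Parse one .map line with either 3 or 4 whitespace-separated fields."""
    fields = line.split()
    if len(fields) == 4:
        chrom, variant_id, cm, bp = fields
    elif len(fields) == 3:
        # Genetic position omitted: fall back to the dummy value '0'.
        chrom, variant_id, bp = fields
        cm = "0"
    else:
        raise ValueError(f"expected 3 or 4 fields, got {len(fields)}")
    return MapRecord(chrom, variant_id, float(cm), int(bp))
```

For example, parse_map_line("1 rs123 0 10583") and parse_map_line("1 rs123 10583") yield the same record.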

.ped

(PLINK/MERLIN/Haploview text pedigree + genotype table)

Original standard text format for sample pedigree information and genotype calls. Normally must be accompanied by a .map file; Haploview requires an accompanying .info file instead.

Contains no header line, and one line per sample with 2V+6 fields where V is the number of variants. The first six fields are the same as those in a .fam file. The seventh and eighth fields are allele calls for the first variant in the .map file ('0' = no call); the 9th and 10th are allele calls for the second variant; and so on.
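
The 2V+6 layout can be illustrated with a short Python sketch. The function name parse_ped_line is hypothetical, and the code assumes fields are separated by spaces or tabs. It splits a line into the six sample fields and one allele pair per variant.

```python
from typing import List, Tuple


def parse_ped_line(line: str, n_variants: int) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split a .ped line into its 6 sample fields and V allele pairs."""
    fields = line.split()
    expected = 2 * n_variants + 6
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    sample = fields[:6]   # same six fields as a .fam line
    calls = fields[6:]    # two allele calls per variant, in .map order
    genotypes = [(calls[i], calls[i + 1]) for i in range(0, len(calls), 2)]
    return sample, genotypes
```

With two variants, the line "FAM1 IND1 0 0 1 2 A G 0 0" gives the pairs ("A", "G") and ("0", "0"), the latter being a no-call.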

The "--qfam" permutation test report is a text file with a header line, and then one line per variant with the following 4-7 fields:

  1. CHR Chromosome code
  2. SNP Variant identifier
  3. BETA Regression slope for real data. Only present with "--qfam emp-se".
  4. EMP_BETA Sample mean of permutation regression slopes. Only present with "--qfam emp-se".
  5. EMP_SE Sample stdev of permutation regression slopes. Only present with "--qfam emp-se".
  6. EMP1 Empirical p-value (pointwise), or lower-p-value permutation count
  7. NP Number of permutations performed for this variant
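
Header line of the .raw file written by "--recode AD" (one additive and one heterozygote column per variant):

FID IID PAT MAT SEX PHENOTYPE snp1_2 snp1_HET snp2_G snp2_HET

Files written by "--make-bed" (default output prefix "plink"):

plink.bed      ( binary file, genotype information )
plink.fam      ( first six columns of mydata.ped )
plink.bim      ( extended MAP file: two extra cols = allele names)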

.bed

(PLINK binary biallelic genotype table)

Primary representation of genotype calls at biallelic variants. Must be accompanied by .bim and .fam files.
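
The byte layout is not described in these notes. The sketch below relies on the layout given in the PLINK 1.9 format specification, which is an assumption as far as this document is concerned: a three-byte header (0x6c 0x1b 0x01) followed by variant-major blocks that pack four samples per byte, lowest-order bits first. The function names are illustrative only.

```python
from typing import List, Optional

# Magic number plus the variant-major mode byte (per the PLINK 1.9 spec).
BED_MAGIC = bytes([0x6C, 0x1B, 0x01])

# 2-bit code -> number of copies of allele 2 (set bits); None = missing call.
_A2_COUNT = {0b00: 0, 0b01: None, 0b10: 1, 0b11: 2}


def has_bed_magic(data: bytes) -> bool:
    """Return True if the data starts with the variant-major .bed header."""
    return data[:3] == BED_MAGIC


def decode_bed_variant(block: bytes, n_samples: int) -> List[Optional[int]]:
    """Decode one variant block into per-sample allele 2 counts."""
    counts = []
    for i in range(n_samples):
        byte = block[i // 4]                     # four samples per byte
        code = (byte >> (2 * (i % 4))) & 0b11    # low-order bits come first
        counts.append(_A2_COUNT[code])
    return counts
```

Each variant block occupies ceil(N/4) bytes for N samples, so a caller would slice the file body into blocks of that size before decoding.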

.bim

(PLINK extended MAP file; number of lines = number of variants (SNPs))

Extended variant information file accompanying a .bed binary genotype table.

A text file with no header line, and one line per variant with the following six fields:

  1. Chromosome code (either an integer, or 'X'/'Y'/'XY'/'MT'; '0' indicates unknown) or name
  2. Variant identifier
  3. Position in morgans or centimorgans (safe to use dummy value of '0')
  4. Base-pair coordinate (1-based; limited to 2^31-2)
  5. Allele 1 (corresponding to clear bits in .bed; usually minor)
  6. Allele 2 (corresponding to set bits in .bed; usually major)
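
A minimal Python sketch of a reader for the six .bim fields follows. BimRecord and parse_bim_line are hypothetical names, and the code assumes whitespace-delimited input. The only validation it performs is the field count and the upper bound on the base-pair coordinate.

```python
from typing import NamedTuple

MAX_BP = 2**31 - 2  # upper limit on the base-pair coordinate


class BimRecord(NamedTuple):
    chrom: str       # chromosome code or name ('0' = unknown)
    variant_id: str  # variant identifier
    cm: float        # genetic position ('0' is a safe dummy value)
    bp: int          # 1-based base-pair coordinate
    allele1: str     # clear bits in .bed; usually minor
    allele2: str     # set bits in .bed; usually major


def parse_bim_line(line: str) -> BimRecord:
    """Parse one .bim line into a typed record."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    chrom, variant_id, cm, bp, allele1, allele2 = fields
    bp_value = int(bp)
    if bp_value > MAX_BP:
        raise ValueError(f"base-pair coordinate {bp_value} exceeds {MAX_BP}")
    return BimRecord(chrom, variant_id, float(cm), bp_value, allele1, allele2)
```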

.fam

(PLINK sample information file; number of lines = number of samples)

Sample information file accompanying a .bed binary genotype table.

A text file with no header line, and one line per sample with the following six fields:

  1. Family ID ('FID')
  2. Within-family ID ('IID'; cannot be '0')
  3. Within-family ID of father ('0' if father isn't in dataset)
  4. Within-family ID of mother ('0' if mother isn't in dataset)
  5. Sex code ('1' = male, '2' = female, '0' = unknown)
  6. Phenotype value ('1' = control, '2' = case, '-9'/'0'/non-numeric = missing data if case/control)
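
The coding rules above can be sketched in Python as follows. The names parse_fam_line and decode_case_control are illustrative, not PLINK functions. The sketch assumes a case/control phenotype (quantitative phenotypes are not handled) and treats any sex code other than '1' or '2' as unknown.

```python
from typing import Dict, Optional

SEX_CODES = {"1": "male", "2": "female"}


def decode_case_control(value: str) -> Optional[str]:
    """Map a case/control phenotype code to a label; None means missing."""
    if value == "1":
        return "control"
    if value == "2":
        return "case"
    return None  # '-9', '0', or non-numeric


def parse_fam_line(line: str) -> Dict[str, Optional[str]]:
    """Parse one .fam line into a dict with decoded parents, sex, phenotype."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    fid, iid, father, mother, sex, phenotype = fields
    if iid == "0":
        raise ValueError("within-family ID cannot be '0'")
    return {
        "fid": fid,
        "iid": iid,
        "father": None if father == "0" else father,  # '0' = not in dataset
        "mother": None if mother == "0" else mother,
        "sex": SEX_CODES.get(sex, "unknown"),
        "phenotype": decode_case_control(phenotype),
    }
```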

.bed, .fam, .bim to .map and .ped

plink --bfile filename --recode --tab --out out_file

.map and .ped to .bed, .fam, .bim

plink --file filename --make-bed --out out_file