Welcome to the Maize pan-genome beta site

Maize B73 Genome Assembly and Gene Annotations

An entirely new assembly of the maize genome (B73 RefGen_v4) was constructed from PacBio Single Molecule Real-Time (SMRT) sequencing at approximately 60-fold coverage and scaffolded with the aid of a high-resolution whole-genome restriction (optical) map. The pseudomolecules of maize B73 RefGen_v4 were assembled nearly end-to-end, representing a 52-fold improvement in average contig size relative to the previous reference (B73 RefGen_v3).

Genes were annotated with the MAKER pipeline (Campbell et al, 2014) using approximately 111,000 transcripts obtained by single-molecule sequencing. These long-read Iso-Seq data (Wang et al, 2016) improved annotation of alternative splicing, more than doubling the number of alternative transcripts from 1.5 to 3.8 per gene and thereby improving our knowledge of gene structure and transcript variation. Overall, the update brings substantial improvements, including resolved gaps and misassemblies, corrections to strand, consolidation of gene models, and anchoring of previously unanchored genes.
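A transcripts-per-gene figure like the one quoted above can be recomputed from a GFF3 annotation file by counting mRNA features per parent gene. The following is a minimal sketch, assuming a standard GFF3 layout in which each mRNA row carries a `Parent=` attribute naming its gene. The function names and the toy records are hypothetical and are not taken from the B73 release.

```python
from collections import Counter


def mean_transcripts_per_gene(gff3_lines):
    """Return the mean number of mRNA features per gene that has at least one."""
    per_gene = Counter()
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue  # only transcript-level rows are counted
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        parent = attrs.get("Parent")
        if parent:
            per_gene[parent] += 1
    if not per_gene:
        return 0.0
    return sum(per_gene.values()) / len(per_gene)


def row(feature_type, attributes):
    """Build a toy GFF3 row; coordinates are placeholders."""
    return "\t".join(["chr1", "demo", feature_type, "1", "900", ".", "+", ".", attributes])


# Hypothetical records: geneA has three transcripts, geneB has one.
toy = [
    "##gff-version 3",
    row("gene", "ID=geneA"),
    row("mRNA", "ID=geneA_T001;Parent=geneA"),
    row("mRNA", "ID=geneA_T002;Parent=geneA"),
    row("mRNA", "ID=geneA_T003;Parent=geneA"),
    row("gene", "ID=geneB"),
    row("mRNA", "ID=geneB_T001;Parent=geneB"),
]

print(mean_transcripts_per_gene(toy))  # 2.0
```

Note that this sketch averages only over genes with at least one mRNA row, so genes without annotated transcripts do not lower the mean.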

Gene annotation was performed in the laboratory of Doreen Ware (CSHL/USDA). Protein-coding genes were identified using MAKER-P software version 3.1 (Campbell et al, 2014) with the following transcript evidence: 111,151 PacBio Iso-Seq long reads from 6 tissues (Wang et al, 2016), 69,163 full-length cDNAs deposited in GenBank (Alexandrov et al, 2008; Soderlund et al, 2009), 1,574,442 Trinity-assembled transcripts from 94 B73 RNA-Seq experiments (Law et al, 2015), and 112,963 transcripts assembled from deep sequencing of a B73 seedling (Martin et al, 2014). Additional evidence included annotated proteins from Sorghum bicolor, Oryza sativa, Setaria italica, Brachypodium distachyon, and Arabidopsis thaliana downloaded from Ensembl Plants Release 29 (Oct-2015). Gene calling was assisted by Augustus (Keller et al, 2011) and FGENESH (Salamov & Solovyev, 2000), trained on maize and monocots, respectively. Low-confidence gene calls were filtered on the basis of an Annotation Edit Distance (AED) score and other criteria and are viewable as a separate track. The resulting higher-confidence set (called the filtered gene set) contains 39,324 protein-coding genes. Gene annotations from B73 RefGen_v3 were mapped to the new assembly and are also available as a separate track. In addition, 2,532 long non-coding RNA (lncRNA) genes were mapped and annotated from prior studies (Li et al, 2014; Wang et al, 2016), 2,290 tRNA genes were identified using tRNAscan-SE (Lowe & Eddy, 1997), and 154 miRNA genes were mapped from miRBase (Kozomara & Griffiths-Jones, 2014).
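The AED-based filtering step can be illustrated with a short sketch. AED ranges from 0 (the gene model agrees fully with the supporting evidence) to 1 (no agreement), and MAKER writes it to the `_AED` attribute of each mRNA row in its GFF3 output. The cutoff of 0.5 below is a placeholder assumption, not the threshold used for this release, and the "other criteria" mentioned above are not modeled. The function names and toy records are hypothetical.

```python
def parse_attributes(field):
    """Turn a GFF3 column-9 string into a dict of key -> value."""
    return dict(kv.split("=", 1) for kv in field.split(";") if "=" in kv)


def split_by_aed(gff3_lines, max_aed=0.5):
    """Partition mRNA IDs into (kept, filtered) lists by their _AED score.

    max_aed is an illustrative cutoff only; it is not the release threshold.
    """
    kept, filtered = [], []
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue  # AED is recorded on transcript rows
        attrs = parse_attributes(cols[8])
        if "_AED" not in attrs:
            continue  # rows without a score are left out of both lists
        target = kept if float(attrs["_AED"]) <= max_aed else filtered
        target.append(attrs.get("ID", ""))
    return kept, filtered


def mrna_row(attributes):
    """Build a toy mRNA row; coordinates are placeholders."""
    return "\t".join(["chr1", "maker", "mRNA", "1", "900", ".", "+", ".", attributes])


# Hypothetical transcripts with made-up AED scores.
example = [
    "##gff-version 3",
    mrna_row("ID=t1;Parent=g1;_AED=0.10"),
    mrna_row("ID=t2;Parent=g2;_AED=0.80"),
    mrna_row("ID=t3;Parent=g3;_AED=0.50"),
]

kept, filtered = split_by_aed(example)
print(kept)      # ['t1', 't3']
print(filtered)  # ['t2']
```

Returning both lists rather than discarding the low-confidence calls mirrors the way the filtered-out models remain viewable as their own track.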

Maize W22 Genome Assembly

A new genome assembly of Zea mays W22 is available here. The assembly (Zm-W22-REFERENCE-NRGENE-2.0) was generated at the Roy J. Carver Biotechnology Center (Urbana, IL) at the University of Illinois, and its accompanying annotation was produced with the MAKER-P software in Doreen Ware's lab.

This sequence has been released under the Toronto Agreement. No whole-genome research may be submitted for publication until the official publication for this genome assembly has been published.

Maize PH207 Genome Assembly

A new genome assembly of Zea mays PH207 is available here. The assembly (Zm-PH207-REFERENCE_NS-UIUC_UMN-1.0) and its accompanying annotation were generated by Candy Hirsch's group at the University of Minnesota. PH207 is a key founder line of Iodent germplasm and represents an alternative to B73 (stiff stalk germplasm).

What's in this release

  • Three Zea mays genomes (B73, W22, and PH207)
  • Compara analysis
    • Protein comparative analysis generated a total of 23,442 GeneTree families comprising 422,202 individual genes (480,265 input proteins) from 15 species
    • Pairwise whole-genome alignments were constructed between the following genomes
      • Zea mays B73 vs. Zea mays W22
      • Zea mays B73 vs. Sorghum bicolor
      • Zea mays W22 vs. Sorghum bicolor
    • Synteny was calculated from orthologs between genomes for the following pairs
      • Zea mays B73 vs. Zea mays W22
      • Zea mays B73 vs. Zea mays PH207
      • Zea mays PH207 vs. Zea mays W22
      • Zea mays B73 vs. Sorghum bicolor
      • Zea mays W22 vs. Sorghum bicolor
      • Zea mays PH207 vs. Sorghum bicolor