#CHANGE LOG for ALEXA-Seq Analysis Code #ALEXA-SEQ VERSION 0.1 - Beta release for external collaborators #ALEXA-SEQ VERSION 1.1 - First public release #ALEXA-SEQ VERSION 1.2 - Numerous bug fixes and updates - Performed complete code test for both the annotation and analysis arms - Tested the annotation arm by generating an annotation DB for Zebrafish: 'dr_57_8c' (now available online at www.AlexaPlatform.org) - Tested the analysis arm by generating analysis for the project: 'LnCAP_AR_KnockIn' (now available online at www.AlexaPlatform.org) #ALEXA-SEQ VERSION 1.3 - Numerous minor bug fixes - Added Xapian-Omega template files to the source code repository (as examples) - Added additional project and annotation configuration file examples #ALEXA-SEQ VERSION 1.4 - Fixed missing directories bug in fresh build of analysis dirs - Added extra instructions for automated installation of BioPerl - Fixed bug with jobs that determine the number of quality read counts - Simplified summary of quality read counts and average read length - Modified calculation of library complexity jobs to avoid collisions between multiple projects being processed at the same time - Fixed bug caused by unusual duplication of a few ensembl exon record in EnsEMBL 57 - Fixed minor bug in naming of junction and boundary parsing log files - Removed accidental hard coding of server name in some expression calculation jobs - Added code to help check completion status of parsing jobs #ALEXA-SEQ VERSION 1.5 - Updated code to allow better control of stringency of hits to exon junctions/boundaries (particularly important for short reads such as 36-mers) - Updated examples parameter settings for different read lengths - Removed all mapping and assignment parameters from the project configs files. These parameters are now determined automatically and without input from the user. - Modified use of mapping and assignment parameters to accomodate libraries made up of lanes with different read lengths - Added a utility to assist in the resetting all read assignments - Added additional memory usage monitoring to read assignment steps - Fixed bug in creation of ALEXA-seq annotation databases that resulted in missing Entrez gene name information - Reduced memory footprint of all read assignment jobs - Improved tracking of creation of ALEXA-Seq viewer data for each gene - Added code to assist in packaging of results files for distribution to collaborators #ALEXA-SEQ VERSION 1.6 - Added sanity check for non-unique flowcell-lane combinations - Added sanity checks for lane, library and comparison records in user provided config files - Removed hard coding of navigation links for ALEXA-Seq gene records, etc. - Modified expression calculations so that values are normalized for library size differences according to mapped bases instead of mapped reads - Eliminated hard coding of Google Analytics ID and made this a global parameter specified by the user - Updated example configuration files and re-organized these - Improved efficiency of File I/O usage for all expression calculation steps and added QC checks to ensure these steps completed successfully - Updated mapping parameter values to accomodate reads up to 100 bases in length - Added lane-by-lane quality analysis for each library to the ALEXA-Seq viewer - Added lane-by-lane summary of mapping statistics to the ALEXA-Seq viewer - Added position bias analysis for each library to the ALEXA-Seq viewer - Added percent gene coverage analysis for each library to the ALEXA-Seq viewer - Added tag redundancy (library complexity) analysis and plots to the ALEXA-Seq viewer - Added basic library info to the ALEXA-Seq viewer - Added highlighting of Significant AE events to the ALEXA-Seq viewer #ALEXA-SEQ VERSION 1.7 - Various minor bug fixes - Various performance improvements - Added automatic generation of candidate gene lists for AE+DE genes, AE genes only, gains only, and losses only - Updated main project summary records in data viewer to include more basic info on the project and the analysis - Updated gene records in data viewer to use smarter color schemes in line plots - Added gene model diagrams to gene records in data viewer - Added automatic generation of candidate peptides for design of antibodies that are potentially specific to each condition #ALEXA-SEQ VERSION 1.8 - Various minor bug fixes - Various display tweaks of the data viewer - Implemented use of BWA aligner (the user can select to use BLAST only or BWA+BLAST) - Added second method of estimating library complexity - using mapping positions instead of sequence identity - Added beta code to modify data viewer for larger datasets (i.e. projects involving many libraries) - Added beta code to perform statistics on comparisons of library groups in addition to pairwise library comparisons - Added code to automate installation of ALEXA-Seq annotation databases and associated mysql databases #ALEXA-SEQ VERSION 1.9 - Reduced memory usage of estimate library complexity jobs - Added BWA only option - Various minor bug fixes related to implementation of BWA alignments #ALEXA-SEQ VERSION 1.10 - Various minor bug fixes in code - Various bug fixes and performance improvement within the Virtual Machine - Added beta code that optimizes the data viewer for projects with larger numbers of libraries - Added beta code that performs differential and alternative expression analysis using GROUP comparisons instead of pairwise - Added beta code that used ANCOVA to identify interesting AE events between arbitrary numbers of sample groups #ALEXA-SEQ VERSION 1.11 - Fixed bug that broke the demo in Virtual Machines - Added step to partition the ALEXA-Seq annotation database by genome region (allows for memory and fileI/O performance improvements in various downstream steps) - Improved file I/O performance of gene record generation for the data viewer - Improved file I/O performance for all expression calculation steps - Reduced memory usage during identification of candidate genes - Reduced memory usage during identification of candidate peptide sequences #ALEXA-SEQ VERSION 1.12 - Various minor bug fixes - Various performance tweaks for the Virtual Machine Demo - Added code to identify the protein coding frame associated with all junction sequences - Improved file I/O performance of gene record generation for the data - Reduced memory usage during creation of Matrix data files - Reduced memory usage during Differential and Alternative Expression calculations - Reduced memory usage during identification of candidate genes (again) - Improved file I/O efficiency during creation of gene records in data viewer #ALEXA-SEQ VERSION 1.13 - Improved file I/O efficiency during creation of gene records in data viewer (again) #ALEXA-SEQ VERSION 1.14 - Added script to clean up temp files at the end of an ALEXA-seq analysis - Fixed regex in Web.pm to allow lib names with underscore - Minor fixes to handling of fastq files - Added new script to create dummy R2 qseq files for SE data - Various other minor tweaks #ALEXA-SEQ VERSION 1.15 - Minor formatting updates to create analysis commands steps - Added scripts to assist in the creation of multiple junction/boundary sequence lengths - Added information to manual and commands scripts explaining which jobs can be run concurrently #ALEXA-SEQ VERSION 1.16 - Various minor bug fixes - Improved performance of fastq data import - Fixed problem with BWA alignment parsing that allowed some alignments with gaps to be scored too highly - Added helper script to demonstate splitting large single lanes into smaller 'sub-lanes' #ALEXA-SEQ VERSION 1.17 - Dramatically reduce memory usage of generate expression values (Gene/Exon) when there are many redundant alignments (e.g. high coverage of MT genes) - Added library skew statistics and summary to the library summary output - Added support for recent versions of Ensembl (tested on v65) - Added support for null passwords in annotation database creation code - Fixed various places in the website generation code where the junction/boundary length was being hard-coded instead of retrieved from global config file