Monday 5 November 2012

Running GeneWise with HMMs

A nice feature of Ewan Birney's GeneWise software is that it can use HMMs of gene families to help predict genes in a DNA sequence.

This can be done using:
% genewise <hmmfile> <fasta> -hmmer ... [other options]
where <hmmfile> is your HMM file, and <fasta> is the FASTA file containing your DNA sequence.

In a previous post, I described how to train GeneWise so that it uses a splice site parameter file that has been trained for your species.

If you want to run GeneWise with HMMs, and also want to use a splice site parameter file for your species, you will need to type:
% genewise <hmmfile> <fasta> -hmmer -genestats <paramfile> -nosplice_gtag ... [other options]
where <paramfile> is your splice site parameter file.
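
If you call GeneWise from a script rather than typing the command by hand, the command line above can be assembled as an argument list. The following is only a minimal Python sketch and is not part of GeneWise: the function names build_genewise_command and run_genewise are hypothetical, it assumes the genewise executable is on your PATH, and it uses only the options shown above.

```python
import subprocess


def build_genewise_command(hmm_file, fasta_file, param_file=None, extra_options=()):
    """Return the genewise argument list for one HMM against one DNA sequence."""
    cmd = ["genewise", hmm_file, fasta_file, "-hmmer"]
    if param_file is not None:
        # Use a species-specific splice site parameter file,
        # with the same two options as in the command above.
        cmd += ["-genestats", param_file, "-nosplice_gtag"]
    # Any further options are passed through unchanged.
    cmd += list(extra_options)
    return cmd


def run_genewise(hmm_file, fasta_file, param_file=None, extra_options=()):
    """Run genewise (assumed to be on the PATH) and return its standard output."""
    cmd = build_genewise_command(hmm_file, fasta_file, param_file, extra_options)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, build_genewise_command("family.hmm", "scaffold.fa", "species.stats") gives the same arguments as the command above, where the three file names are made up for illustration.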

The above command can only be used to compare one HMM to one DNA sequence.

The GeneWise software comes with a program called genewisedb, which can be used to compare multiple HMMs to a FASTA file of multiple sequences. Unfortunately, genewisedb does not have the -genestats option, so you cannot use your own splice site parameter file with it.

If you want to use the -genestats option with multiple HMMs and multiple sequences, you can use my Perl script run_genewisedb.pl. It runs genewise to compare each HMM in your input file of multiple HMMs to each DNA sequence in a FASTA file of multiple sequences.
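
The idea of comparing every HMM to every sequence can be sketched in a few lines of Python. This is not the run_genewisedb.pl script itself, just an illustration of the approach under some assumptions: the HMM file is a HMMER text file in which each HMM record ends with a line containing only "//", the FASTA records start with ">", and the function names are hypothetical.

```python
import itertools


def split_hmm_records(text):
    """Split a HMMER text file into one string per HMM (each record ends with '//')."""
    records, current = [], []
    for line in text.splitlines():
        current.append(line)
        if line.strip() == "//":
            records.append("\n".join(current) + "\n")
            current = []
    return records


def split_fasta_records(text):
    """Split a multi-sequence FASTA file into one string per sequence."""
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">"):
            # A new header line closes the previous record, if there is one.
            if current:
                records.append("\n".join(current) + "\n")
            current = [line]
        elif current and line.strip():
            # Sequence line belonging to the current record; blank lines are skipped.
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")
    return records


def all_pairs(hmm_text, fasta_text):
    """Return every (HMM record, FASTA record) combination."""
    return list(itertools.product(split_hmm_records(hmm_text),
                                  split_fasta_records(fasta_text)))
```

Each pair returned by all_pairs could then be written to two temporary files and passed to genewise with the -hmmer, -genestats and -nosplice_gtag options, since genewise compares one HMM to one DNA sequence per run.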
