README GMCONVERT v 0.31; Copyright (c) 2005-2006, Brant C. Faircloth

Disclaimer

GeneMapper(TM) is a registered trademark of Applera Corporation and its subsidiaries in the U.S. and other countries.

I am in no way associated with Applera, Perkin-Elmer, Applied Biosystems, etc. I wrote this program to help myself and those in my group work a bit faster. I thought it might help others.

About this program

This program is meant to convert files exported by Applied Biosystems' GeneMapper(TM) software to a format that is more useful. It essentially re-arrays the data from row to column format. Pretty much every program I have used for analysis demands that the data be in column format of some sort.
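To illustrate what re-arraying from row to column format means, here is a minimal Python sketch. It is not the code GMCONVERT uses. The four-field row layout (sample, marker, allele1, allele2), the sample names, and the output column names are all assumptions made for illustration; a real GeneMapper export has more columns. It is written for a current Python 3 rather than the Python 2.4 this README targets.

```python
from collections import OrderedDict


def rows_to_columns(rows):
    """Re-array (sample, marker, allele1, allele2) rows into one row per sample.

    The four-field row layout is a hypothetical simplification, not the
    actual GeneMapper export format.
    """
    markers = []            # marker names, in order of first appearance
    table = OrderedDict()   # sample name -> {marker: (allele1, allele2)}
    for sample, marker, allele1, allele2 in rows:
        if marker not in markers:
            markers.append(marker)
        table.setdefault(sample, {})[marker] = (allele1, allele2)

    # Header: the sample column, then two allele columns per marker.
    header = ["sample"]
    for marker in markers:
        header += [marker + "_a", marker + "_b"]

    out = [header]
    for sample, calls in table.items():
        line = [sample]
        for marker in markers:
            # Missing genotypes are written as 0, 0 in this sketch.
            line += list(calls.get(marker, (0, 0)))
        out.append(line)
    return out


# Hypothetical input: one row per sample/marker combination.
rows = [
    ("bird01", "Primer15", 120, 124),
    ("bird01", "Primer22", 201, 201),
    ("bird02", "Primer15", 122, 124),
]
for line in rows_to_columns(rows):
    print(line)
```

Each sample ends up on a single line with two allele columns per marker, which is the general shape most analysis programs expect.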

The program is distributed as executables for both OS X and Windows (XP). Both have a simple GUI allowing user interaction. Neither requires that Python or wxPython (the windowing system) be installed, as they are bundled within the application.

There is also a command-line version (gmconvert_command) that should run on any operating system with Python installed. My machine runs Python 2.4.2 at the moment, so you should have at least that version. If you do not, things may work strangely.

Caveats

Generally, I haven't found too many. However, you should be aware that a marker must have the same name in every panel in which it is placed. For example, if you have 'Primer15' in one panel and 'primer15' in another and these are actually the same primer, you are going to have some problems. The program is case sensitive, so these two names are NOT the same, and if you run an exported file containing both, you will get separate results for 'Primer15' and 'primer15'. Bummer. This is easily remedied by ensuring your primer names are identical across any and all panels in which they happen to be found.
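If you want to check a list of marker names for this problem before converting, something along these lines might help. This is only a sketch and is not part of GMCONVERT; the function name find_case_conflicts is made up for illustration, and pulling the marker names out of your export file is left to you. It is written for a current Python 3.

```python
def find_case_conflicts(marker_names):
    """Group marker names that are identical except for letter case."""
    groups = {}
    for name in marker_names:
        # Names that match when lower-cased land in the same group.
        groups.setdefault(name.lower(), set()).add(name)
    # Keep only the groups with more than one distinct spelling.
    return [sorted(spellings) for spellings in groups.values() if len(spellings) > 1]


print(find_case_conflicts(["Primer15", "primer15", "Primer22"]))
```

Any group it reports is a marker that would be split into separate results.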

You should also be aware that + and - control samples will be exported from GeneMapper assuming they pass concordance control. This is stupid, but not my fault. So, once you have converted your files, you should remove these.

Finally, GMCONVERT does not count samples with GQ scores < 0.75. Typically, these should not be used in analysis. If you manually edit allele calls, your GQ score should be set to 1.0 (this occurs automatically). If there is interest, I may add an option to include calls of lesser quality in a subsequent release.
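The quality rule above can be sketched in a few lines of Python. This is an illustration of the stated cutoff only, not GMCONVERT's own code; the tuple layout and sample names are assumptions, and the sketch is written for a current Python 3.

```python
GQ_THRESHOLD = 0.75  # cutoff stated in this README


def passes_quality(gq_score, threshold=GQ_THRESHOLD):
    """Return True if a call would be kept (GQ score not below the threshold)."""
    return gq_score >= threshold


# Hypothetical calls as (sample, marker, GQ score).
calls = [
    ("bird01", "Primer15", 1.0),    # manually edited call, GQ set to 1.0
    ("bird02", "Primer15", 0.4),    # below the cutoff, so not counted
    ("bird03", "Primer15", 0.75),   # exactly at the cutoff, so kept
]
kept = [call for call in calls if passes_quality(call[2])]
print(kept)
```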

Platforms

I have tested the binary, executable versions of GMCONVERT on the following platforms:

  1. Apple OS X (10.4.4) - all worked well.
  2. Windows XP, SP2 - all worked well.

I have tested GMCONVERT_command on the following platforms. Helpful notes from my trials are included:

  1. Apple OS X (10.4.4), Python 2.3.5 - worked as advertised.
  2. Windows XP, Python 2.4.2 - worked as advertised.
  3. FreeBSD 5.4, Python 2.3.5 - worked, but I had to invoke Python with the following: /usr/local/bin/python2.3.
    You could change the shebang line or symlink /usr/local/bin/python2.3 to /usr/bin/python. If you are using FreeBSD, you should know what I'm talking about.
  4. Gentoo Linux (2.6.14), Python 2.4.2 - worked as advertised.

About the files in the archive

OS X - GUI

Distributed as a disk-image (.dmg) file:

Windows - GUI

Distributed as a Zip (.zip) file:

GMCONVERT_command

Distributed as a gzipped, tar archive (.tar.gz):

Installing

GUI Version - OS X

  1. Double-click the .dmg to mount it, then drag gmconvert.app to a location on your hard drive (e.g. your Applications folder)
  2. Double-click to run
  3. Follow the instructions

GUI Version - Windows XP

  1. Unzip the .zip file
  2. Save the unzipped folder somewhere - DO NOT separate the contents of the unzipped folder! This will render the program non-functional.
  3. Double-click 'gmc' icon to run
  4. Follow the instructions

Command-line version

OS X

  1. Double-click the downloaded archive to unzip it
  2. Leave the file (gmconvert_command.py) in the directory it created or move it wherever you like.
  3. Invoke the program by navigating to the folder in your command-line program of choice (e.g. Terminal) and running it with (at minimum):

$ python gmconvert_command.py

This should give you several prompts at which you can enter paths, options, etc.

You may also run the program by invoking command-line options:

./gmconvert_command.py -i [input file] -o [output file] -f [format]

where -f = 'cervus' | 'genepop' | 'gerud'

Therefore, the program can be run as such:

./gmconvert_command.py -i /Users/bcf/15_good_samples_3OLG_ls.txt -o /users/bcf/desktop/test5.csv -f gerud

Windows

Before using the program on Windows, you will need to install Python. The current version is 2.4.2 and is available from python.org.

Command-line version - Windows XP (SP2) Instructions - Easy way

  1. Download the ZIP archive (WinZIP should also extract the .tgz archive)
  2. Right-click and choose to 'Extract All...' to some folder (e.g. desktop)
  3. Drag your GeneMapper output to this folder
  4. Double-click on gmconvert_command.py
  5. Hit enter at the first prompt
  6. Type the name of the file you dragged to the folder at the next prompt (don't forget to add .txt) (e.g. if the file you dragged was Coelacanth_genotypes.txt, enter 'Coelacanth_genotypes.txt')
  7. Choose an output format (cervus | genepop | gerud)
  8. Hit enter at the next prompt to save the outfile with the default name in the GMCONVERT folder

Command-line version - Windows XP (SP2) Instructions - Complicated way

  1. Download the ZIP archive (WinZIP should also extract the .tgz archive)
  2. Right-click and choose to 'Extract All...' to some location (e.g. desktop)
  3. Drag gmconvert_command to some folder (e.g. c:\GeneMapper\)
  4. Double-click on gmconvert_command
  5. Hit enter at the first prompt
  6. Enter the path to your input file. Python deals with Windows directories strangely, so you will have to type the path using forward slashes, and don't forget the '.txt' (e.g. c:/myFolder/myGeneMapperExportFile.txt)
  7. Choose a format for the output.
  8. You can hit enter here to save in the current directory (in this example, it would be c:\myFolder), or you can save in another directory (with a different file name) by entering:

c:/myOtherFolder/myGMCONVERTOutfile.csv (or .txt or nothing for genepop)

Note: Python on Windows appears to deal with spaces in directory names effectively. So if your input file is in: c:\Documents and Settings\bcf\GeneMapper\output.txt

You would enter this upon running the program as c:/Documents and Settings/bcf/GeneMapper/output.txt (i.e. change backslashes to forward slashes)
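If you would rather not retype paths by hand, the backslash-to-forward-slash change described above is a one-line string replacement in Python. This is just a convenience sketch; the function name to_forward_slashes is made up for illustration and is not part of GMCONVERT. It is written for a current Python 3.

```python
def to_forward_slashes(windows_path):
    """Replace backslashes with forward slashes; spaces are left untouched."""
    return windows_path.replace("\\", "/")


# The r prefix keeps Python from treating the backslashes as escapes.
print(to_forward_slashes(r"c:\Documents and Settings\bcf\GeneMapper\output.txt"))
```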

Why are there 2 versions?

There are 2 versions for several reasons. There is a GUI version for OS X and Windows because it is easy to use. There is a command-line version because it can be used for batch-processing of numerous files given a little Python or shell-scripting magic (this is up to you). The command-line version will also run on numerous platforms for which I don't have time to create a GUI.
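As one possible starting point for that batch-processing magic, here is a rough Python sketch that runs the command-line version once for every .txt file in a folder. It relies only on the -i, -o and -f options shown earlier in this README. Everything else is an assumption: the function names, the folder names in the usage note, the choice of a .csv extension for every output file, and the idea that a plain "python" on your path is an interpreter that can run gmconvert_command.py. The wrapper itself is written for a current Python 3.

```python
import glob
import os
import subprocess


def build_command(infile, outdir, fmt="gerud",
                  script="gmconvert_command.py", python="python"):
    """Build the argument list for one conversion using -i, -o and -f."""
    base = os.path.splitext(os.path.basename(infile))[0]
    # Assumption: every output file is given a .csv extension.
    outfile = os.path.join(outdir, base + ".csv")
    return [python, script, "-i", infile, "-o", outfile, "-f", fmt]


def convert_all(indir, outdir, fmt="gerud"):
    """Run gmconvert_command.py once per .txt file found in indir."""
    for infile in sorted(glob.glob(os.path.join(indir, "*.txt"))):
        # subprocess.call waits for each conversion to finish before the next.
        subprocess.call(build_command(infile, outdir, fmt))
```

With your exports in a folder called exports and an existing empty folder called converted (both hypothetical names), you would call convert_all("exports", "converted", fmt="gerud") from the directory holding gmconvert_command.py.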

Why only 3 file format options?

Well, because I am lazy. Seriously, cervus, genepop, and gerud ought to cover a lot of actual programs since many do cross-conversion of files from one type to another these days. If there is something that you absolutely must have, let me know and I will add it to the list.

What sort of license does this program have?

This program is released under the GPL (GNU General Public License). Details are included in gpl.txt. This program is also released with NO WARRANTY. Use it at your own risk (like swimming at the beach).

Technical details

The actual script was written in Python, the windowing system uses wxPython, the application bundles were built with py2app (OS X) or py2exe (Windows), the icons were made in Photoshop, the .dmg file for OS X was created using DropDMG, and the Zip file for Windows was created on XP.

Bobwhite! 3/3/2005