LMs4TNT

Java tool to convert xy coordinates to a TNT data block


jt4tnt v 1.0, january/21/2016


by Julio Sandria & Efrain De Luna, INECOL

*****************

 

 

This program reads “x, y” coordinates of landmarks and semilandmarks from each specimen, estimates means for each group, and writes blocks of terminal names and coordinates, with the corresponding commands, as required by the TNT phylogenetic software.

Download "jt4tnt.class" and "jt4tnt.txt". Please note that “jt4tnt.class” is a Java tool, so it is platform independent. Java Tool for TNT requires Java 6 or later (Java 1.6). The file "jt4tnt.txt" contains basic help info on usage. The other files are examples: a data file with 8 configurations and a membership list for two species.

You can cite this tool as:

Sandria, J. & E. De Luna. LMs4TNT, Java tool to convert xy coordinates to a TNT data block. jt4tnt v1.0, jan/21/2016. url: http://www.filogenetica.org/Java_tool/lms4tnt.htm (date accessed: xxxxxxx).

Download links

>> jt4tnt.class

>> jt4tnt.txt

examples

>> 8configs.tps

>> 8labels2spp.txt

 
     
 

Usage:

Place this Java tool and your input files in the same folder.

If the directory or folder is “lms4tnt”, the landmark data file is “8configs.tps”, and the membership list is “8labels2spp.txt”, then you should type:

>cd lms4tnt;

>java jt4tnt -i 8configs.tps -o twospecies.tnt -m 8labels2spp.txt

 

First, open your system command window. The command line you type there consists of several expressions separated by spaces.

On a Mac, open “Terminal” and use the “cd” command to move to the folder containing the file jt4tnt.class and your input files.
If your folder is “lms4tnt”, the command will normally look like this:
your_computer:~ your_user_folder$ cd lms4tnt


The following prompt indicates your current location:
your_computer:lms4tnt user_folder$

On a Windows machine, use the command prompt. Type the “cd” command to move your active location to the folder with the Java tool and your data files.

 
 

Instructions
When you type the following command line, the converter will read the x,y configurations (input file), estimate means for each group (member.txt), and write the coordinates in the format required by TNT (output file).

>java jt4tnt [-options] -i <input.tps> -o <output.tnt> [-m <member.txt>]

-i <input.tps> coordinate data file in TPS format; use <input.txt> instead if the landmark data are "x,y" coordinates with CS, in row format
-o <output.tnt> output data file in TNT format; alternatively, you can use the filename extension *.txt
-m <member.txt> membership data file in TXT format

options:
[ ] square brackets mean optional
-d delete last two points (for ruler)
-version print version of jt4tnt
-? -help print this help message

Examples

Example of a command line; note that the input data are coordinates in tps format:

>java jt4tnt -i my300configs.tps -o my10spp.txt -m membership10spp.txt

This command line will read 300 configurations from the tps file “my300configs.tps”, calculate means for 10 groups according to the labels in “membership10spp.txt”, and write these means to the output file “my10spp.txt” with all necessary TNT commands.

Another example of a command line. Here the input data are in a file with a TXT extension, "yourlandmarks.txt", because it contains superimposed coordinates and CS in row format:

>java jt4tnt -i yourlandmarks.txt -o yourlandmarks.tnt -m membership_list.txt

 

 
 

Input data files
When the landmark data are in tps format, use a file name of the form <input.tps>. Otherwise, use a file name of the form <input.txt> for coordinates in row format.

Input data file in tps format.


This file converter can read the usual <tps> files produced with tpsDig software for landmark registration.

>java jt4tnt -i input.tps -o output.tnt -m member.txt


For example, this is the data file in tps format for two configurations with ten points.

<input.tps>


LM=10
758.00000 821.00000
1067.00000 872.00000
1353.00000 863.00000
1314.00000 790.00000
1251.00000 707.00000
1172.00000 640.00000
1112.00000 567.00000
1033.00000 526.00000
935.00000 517.00000
795.00000 629.00000
IMAGE=A. conco.CICIMAR-4009-3.jpg
ID=0

LM=10
603.00000 638.00000
781.00000 791.00000
1009.00000 897.00000
1008.00000 843.00000
974.00000 757.00000
939.00000 688.00000
922.00000 596.00000
864.00000 530.00000
792.00000 500.00000
674.00000 527.00000
IMAGE=A. conco.CICIMAR-2040.jpg
ID=1

If your tps landmark data contain coordinates for the ruler, please make sure these are the last two points. In this case, remember to use the optional command -d as follows:

>java jt4tnt -d -i input.tps -o output.tnt -m member.txt

 

This command line will delete the coordinates of the last two points.
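
As a rough illustration of the tps input described above, the following Java sketch reads "LM=" blocks and can discard the last two points of each configuration. It is not the source code of jt4tnt: the class name TpsSketch and the method readConfigs are hypothetical, and the sketch assumes that every specimen starts with an "LM=n" line followed by exactly n lines of "x y" values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the jt4tnt source: reads "LM=n" blocks from tps text.
final class TpsSketch {

    // Returns one double[points][2] array per specimen. If dropRuler is true,
    // the last two points of each configuration are discarded, which is the
    // effect described for the -d option.
    static List<double[][]> readConfigs(List<String> lines, boolean dropRuler) {
        List<double[][]> configs = new ArrayList<>();
        int i = 0;
        while (i < lines.size()) {
            String line = lines.get(i).trim();
            i++;
            if (!line.toUpperCase().startsWith("LM=")) {
                continue; // skip IMAGE=, ID= and blank lines
            }
            int n = Integer.parseInt(line.substring(3).trim());
            int keep = dropRuler ? Math.max(0, n - 2) : n;
            double[][] pts = new double[keep][2];
            for (int p = 0; p < n; p++, i++) {
                if (p >= keep) {
                    continue; // ruler points are stepped over, not stored
                }
                String[] xy = lines.get(i).trim().split("\\s+");
                pts[p][0] = Double.parseDouble(xy[0]);
                pts[p][1] = Double.parseDouble(xy[1]);
            }
            configs.add(pts);
        }
        return configs;
    }
}
```

The lines could come, for example, from java.nio.file.Files.readAllLines on the tps file. A truncated file would make this sketch fail with an exception, so real code would need extra checks.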

Input data file in row format.
This file converter can also read an input file <input.txt> with the superimposed configurations in row format <x y x y … CS>, as produced by most morphometric software. For example, IMP CoordGen saves aligned configurations by rows. Each row corresponds to one shape. The final number in each row is the CS value.

The command line should be as follows:

>java jt4tnt -i input.txt -o output.tnt -m member.txt

In this case, the file converter will read landmarks and CS to write two blocks of data for TNT. One block will contain the average values of centroid size. The second block will contain the landmarks. More below in the section "Output data files".

The following contains landmark data in row format, with eight points after a Procrustes superposition.

<input.txt>
-0.417667 0.245384 0.0148225 0.245394 0.402483 0.168824 0.304739 0.0460361 0.126001 -0.0795119 -0.0471027 -0.214345 -0.244854 -0.208623 -0.410425 -0.0245412 738.168
etc

In some morphometric tools (for example, CoordGen8), the data file with the superimposed configurations contains specimen labels following CS, at the end of each row. These labels are commonly placed after the percent sign %.

<input.txt>
-0.417667 0.245384 0.0148225 0.245394 0.402483 0.168824 0.304739 0.0460361 0.126001 -0.0795119 -0.0471027 -0.214345 -0.244854 -0.208623 -0.410425 -0.0245412 738.168 % Ibarra et al 1785 efan.jpg 4
etc


In this case, the file converter will ignore all text after the % sign.
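
The row format described above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the jt4tnt source: the class RowSketch and its methods are hypothetical, values are assumed to be separated by whitespace, and each row is assumed to be non-empty.

```java
// Hypothetical sketch, not the jt4tnt source: parses one row "x y x y ... CS % label".
final class RowSketch {

    // Returns the numeric values of a row; the last element is the CS value.
    // Everything from the first '%' onward is ignored.
    static double[] parseRow(String row) {
        int cut = row.indexOf('%');
        String data = (cut >= 0 ? row.substring(0, cut) : row).trim();
        String[] tokens = data.split("\\s+");
        double[] values = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Double.parseDouble(tokens[i]);
        }
        return values;
    }

    // Number of landmarks implied by a parsed row: all values except CS, in x,y pairs.
    static int landmarkCount(double[] values) {
        return (values.length - 1) / 2;
    }
}
```

For the example row above, which has 16 coordinate values followed by CS, landmarkCount would give 8.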

 
 

Membership list
Statistical morphometric programs, such as IMP, read membership lists that use one number as a label in each row. You can use the same membership list for this Java file converter.


The following example is the usual membership list file used for programs in the IMP series. In this case the “membership list” contains three labels for 15 configurations:

1
1
1
2
2
2
2
3
3
3
3
3
3
3
3

You can also prepare a membership list better suited to TNT, which may contain names for each terminal in your tree search analysis instead of numbers.


The following is an example of a membership list with 3 species names used as “labels” for a data set with 15 configurations:

species1
species1
species1
species2
species2
species2
species2
species3
species3
species3
species3
species3
species3
species3
species3

In either case, numbers or text, the file with the membership list must be named <filename.txt>.
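
One way to picture how a membership list assigns configurations to groups is the Java sketch below. It is an assumption-laden illustration, not the jt4tnt source: the class MembershipSketch and the method groupByLabel are hypothetical, blank lines are skipped, and groups are kept in order of first appearance, which may or may not match the tool's behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the jt4tnt source: groups configurations by label.
final class MembershipSketch {

    // Maps each label (number or name) to the 0-based positions of the
    // configurations that carry it, in order of first appearance.
    static Map<String, List<Integer>> groupByLabel(List<String> labels) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        int index = 0;
        for (String raw : labels) {
            String label = raw.trim();
            if (label.isEmpty()) {
                continue; // ignore blank lines
            }
            List<Integer> members = groups.get(label);
            if (members == null) {
                members = new ArrayList<>();
                groups.put(label, members);
            }
            members.add(index++);
        }
        return groups;
    }
}
```

Because labels are treated as plain text, the same sketch handles both the numeric list and the species-name list shown above.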

 
 

Output data files

This file converter will write 2D coordinates in the format required by TNT.
Each row of the output file will contain the “label” (taken from the “membership list”), a space, and then each landmark written as its “x” coordinate, a comma, and its “y” coordinate, with a space between landmarks.

If you provide coordinates in the tps format as <input.tps>, the file converter will take the x,y values, calculate the x,y average values for groups as defined by labels in the membership list, and save labeled rows in the TNT format, with the following TNT command lines at the beginning of the output file:

xread
numberC numberT
&   [  landmark  2d ]

 

label1 x,y x,y etc (average x, y values)
label2 x,y x,y etc (average x,y values)
….etc
labelT x,y x,y etc (average x,y values)
;
proc/;

where
“numberC” is the number of characters (configurations, in this case), which will be “1”,
“numberT” is the number of terminals, identified with labels for each data row, usually the species name.
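
The averaging and the output layout described above can be sketched in Java as follows. This is a minimal sketch, not the jt4tnt source: the class TntBlockSketch and its methods are hypothetical, all configurations in a group are assumed to have the same number of points, and the number formatting (Java's default, for example "557.0" rather than "557") may differ from what the tool writes.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the jt4tnt source: group means and the xread block.
final class TntBlockSketch {

    // Averages configurations (each double[points][2]) point by point.
    static double[][] mean(List<double[][]> configs) {
        int n = configs.get(0).length;
        double[][] m = new double[n][2];
        for (double[][] c : configs) {
            for (int p = 0; p < n; p++) {
                m[p][0] += c[p][0];
                m[p][1] += c[p][1];
            }
        }
        for (int p = 0; p < n; p++) {
            m[p][0] /= configs.size();
            m[p][1] /= configs.size();
        }
        return m;
    }

    // Builds the TNT text for one landmark character: numberC is 1 and
    // numberT is the number of labels in the map.
    static String landmarkBlock(Map<String, double[][]> meansByLabel) {
        StringBuilder sb = new StringBuilder();
        sb.append("xread\n");
        sb.append("1 ").append(meansByLabel.size()).append('\n');
        sb.append("& [ landmark 2d ]\n");
        for (Map.Entry<String, double[][]> e : meansByLabel.entrySet()) {
            sb.append(e.getKey());
            for (double[] xy : e.getValue()) {
                sb.append(' ').append(xy[0]).append(',').append(xy[1]);
            }
            sb.append('\n');
        }
        sb.append(";\nproc/;\n");
        return sb.toString();
    }
}
```

Passing a LinkedHashMap keeps the terminals in the order in which they were added.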

Using the files provided as examples, the following command line:

>java jt4tnt -i 8configs.tps -o twospecies.tnt -m 8labels2spp.txt

will write a file named twospecies.tnt that will look like this:

---------------------

xread

1 2

& [ landmark 2d ]

species1 1629.75,557 1504.75,492.5 1377.5,469 1248,447.5 1116.75,453.25 985.25,468.5 855,485 723.25,494 589,529.75 452.75,571 320.5,596.75 189,627.25 316.25,654 448.75,640.5 578.25,650.5 701.75,723.25 828.25,767.5 955.75,808 1082.75,854.75 1210.75,887.25 1340,892 1469.5,889.5 1604.75,850.75 828,256.25 971.5,253.5

species2 1636.25,557.75 1511.75,497 1382.5,468.75 1253.75,455.5 1123.25,459.25 991.25,474.25 859.5,489.5 727.5,499.25 594.75,535.5 461.5,578.25 327,603.75 194.5,630.75 323.75,648.25 454,645.5 584.75,652.5 709,724 836,768.25 963.5,810 1090.25,859 1219,888.25 1348.75,893.25 1479.25,887.75 1613.75,844.25 684.75,227.75 827.5,225.25

;

proc/;

---------------------

If you provide landmarks and CS values in row format as <input.txt>, the file converter will compute the landmark averages and, in addition, take the last value in each row of the input file and save the CS average values as a separate data block in the same output file, each line starting with the same label, a space, and the CS value.


In this case, the output file for TNT will start with the TNT command “nstates cont”, required for the analysis of CS as a continuous character. The second command is "xread". Then comes a pair of numbers for characters and terminals. The first value will be a “2”, indicating two characters in the data matrix (CS and one configuration).

Example:

nstates cont;
xread
2 2


& [Cont]

species1 163.99
species2 44.006

& [ landmark 2d ]

species1 -0.282732, 0 -0.407714, 0.550193 -0.0137207, 0.749018 0.35604, 0.607499 0.348126,0
species2 -0.267554, 0 -0.457277, 0.51781 0.0129517, 0.666652 0.420655, 0.641745 0.291224,0
;
proc/;

where
“numberC” is the number of characters, 2 in this case, one character is CS, and the second character is the shape configuration,
“numberT” is the number of terminals, identified with labels for each data row, usually, the species name.
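
The start of this second kind of output file, up to the end of the continuous block, can be sketched in Java as follows. Again, this is an illustration under assumptions, not the jt4tnt source: the class CsBlockSketch and its methods are hypothetical, and the number formatting is Java's default.

```java
import java.util.Map;

// Hypothetical sketch, not the jt4tnt source: header and CS block for row-format input.
final class CsBlockSketch {

    // Plain arithmetic mean of the CS values of one group.
    static double meanCs(double[] csValues) {
        double sum = 0.0;
        for (double cs : csValues) {
            sum += cs;
        }
        return sum / csValues.length;
    }

    // Builds "nstates cont;", "xread", the "2 numberT" line and the
    // continuous block with one mean CS value per label.
    static String contBlock(Map<String, Double> meanCsByLabel) {
        StringBuilder sb = new StringBuilder();
        sb.append("nstates cont;\nxread\n");
        sb.append("2 ").append(meanCsByLabel.size()).append('\n');
        sb.append("& [Cont]\n");
        for (Map.Entry<String, Double> e : meanCsByLabel.entrySet()) {
            sb.append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

The landmark block, the closing ";" and "proc/;" would follow this text in the same file.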

 
     
 

IF YOU HAVE MULTIPLE SHAPE CHARACTERS,

YOU WILL NEED TO CONVERT THE INPUT FILES ONE BY ONE