LFE - Large File Editor

LFE - file editing commands (to change file content)

LFE location on the UT cluster: /gpfs/gvgpfs/gvhome/toomasha/SOFT/LFE64
All others please DOWNLOAD LFE HERE

FUNCTION	SHORT DESCRIPTION
CHANGEDELIM	Change file delimiter.
DOUBLEMATH	Math with two tables (matrices).
EXTRACTFROMTABLE	Extracts multiple values from table based on a file with x and y coordinates.
IFTHENCOL	Replaces a value in column 1 with a value in a column 2 if the column 1 value is in a reference file.
LINESPACING	Add new rows (empty or not) after every specified number of rows. Header rows optionally unaffected.
LIST2TABLE	Converts a list into a table.
LISTREPLACE	Replace entries in your table using a reference list.
LONGSORT	Sort all rows in ascending order. No file or memory size limit!
MATH	Standard math functions using entries from two columns.
MERGECOLUMNS	Merge any number of columns in any order with or without delimiter.
MULTBYE	Multiply or divide values by a scientific notation (e) number. No size limit.
POINTREPLACE	Replace text at a specific location of a tabular file.
PREAPPEND	Prepend or append text to entries. Allows automatic filling until certain length is achieved.
RANDOMIZEROWS	Randomize a list by changing the order of rows. Making random subsets is possible.
RANGEMATH	Add (sum) or average (mean) values across columns.
REMOVEEMPTY	Remove empty rows or entries from the table; rewrite file for your current OS.
REPLACE	Find and replace text or entries. Also fill in empty data fields.
ROWSORT	Sort all rows in ascending order. File must fit in memory.
RUNSINDEX	Generate indexes based on values in a specified column. Index increments up by one every time the upward run is disrupted by a dip in the values.
SINGLEMATH	Perform exp, ln, log, sqrt, square, inv, rev etc. on any or all columns of your table.
SWAPCOLUMNS	Swap two columns.
TABLE2LIST	Converts table into a list.
TRANSPOSE	Transpose table.

CHANGEDELIM

This function changes the file delimiter.

./LFE32 -M changedelim -file -delim1 -delim2 -out

-file = file name. REQUIRED.
-delim1 = original file delimiter. REQUIRED. Options: a) "tab", "space", "comma", "semicolon", "colon", "pipe" (this means |), b) any symbol or word without spaces. Default = "tab".
-delim2 = new file delimiter. REQUIRED. Options: a) "tab", "space", "comma", "semicolon", "colon", "pipe" (this means |), "none", b) any symbol or word without spaces. Default = "tab". Notes: "none" will remove the previous delimiter.
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M changedelim -file myinput.txt -delim1 tab -delim2 ***
Delimiter is changed from tab to "***". Output file is the default: "myinput.txt.out".

DOUBLEMATH

This function performs math with two tables (matrices). See SINGLEMATH for comparison. Two files (tables) are used as input. Math is performed only on the specified columns; in all other columns the table1 values are retained. The two tables are aligned at the upper left entry. The tables don't have to be the same size, but if they are not, the resulting file is truncated accordingly.

./LFE32 -M doublemath -file1 -file2 -function -columns -missing -header -delim -out

-file1 = input file1 name. REQUIRED.
-file2 = input file2 name. REQUIRED.
-function = mathematical function. REQUIRED. Options: "add", "subtract", "multiply", "divide", "larger" (larger value is retained), "smaller" (smaller value is retained), "lessmissing" (the nonmissing value is preferably retained), "moremissing" (the missing value is preferably retained).
-columns = what columns to use. REQUIRED. Notes: columns should be separated by commas and ranges by dashes (see the example below); there should be no spaces.
-missing = what is used to indicate the missing values. Options: any symbol or word without spaces. Default = "NA".
-header = file header. Both files need to have the same header state. Options: "yes", "no". Default = "no".
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", "colon", "slash", "bslash", "dash", "quote", "squote" b) any symbol or word without spaces. Default = "tab".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M doublemath -file1 myfile1.txt -file2 myfile2.txt -function divide -columns 1,7-9 -missing na -header yes -delim space -out results.txt
Values of table1 are divided by values of table2. If one value is missing, or table2 value is 0, "na" is inserted in the resulting table.
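The alignment and truncation rules described above can be illustrated with a short Python sketch. This is not LFE's code: the function name doublemath_divide, the list-of-lists table representation, and the output number format are assumptions made only for illustration.

```python
def doublemath_divide(table1, table2, columns, missing="NA"):
    """Divide table1 by table2 in the chosen 1-based columns (illustrative only)."""
    # Tables are aligned at the upper left entry; the result is truncated
    # to the rows and columns that both tables share.
    n_rows = min(len(table1), len(table2))
    result = []
    for r in range(n_rows):
        row1, row2 = table1[r], table2[r]
        n_cols = min(len(row1), len(row2))
        out = []
        for c in range(n_cols):
            a, b = row1[c], row2[c]
            if (c + 1) not in columns:
                out.append(a)  # columns not selected keep the table1 value
            elif a == missing or b == missing or float(b) == 0:
                out.append(missing)  # missing input or division by zero
            else:
                out.append(str(float(a) / float(b)))
        result.append(out)
    return result


table1 = [["id1", "10", "NA"], ["id2", "8", "6"], ["id3", "1", "1"]]
table2 = [["x", "2", "3"], ["y", "0", "3"]]
print(doublemath_divide(table1, table2, columns={2, 3}))
```

The third row of table1 has no partner in table2, so it is dropped from the result in this sketch.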

EXTRACTFROMTABLE

Uses an input file with coordinates to extract multiple values from a table.

./LFE32 -M extractfromtable -file -list -delim -header -out

-file = input file name. REQUIRED.
-list = the list file with x and y coordinates (one pair per row, separated as defined by "-delim"). This file must not have a header. REQUIRED.
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", "colon", "slash", "bslash", "dash", "quote", "squote" b) any symbol or word without spaces. Default = "tab".
-header = input file header. Options: "yes", "no". Default = "no".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M extractfromtable -file myfile.txt -list coordinates.txt -delim space -header yes -out coordinates2.txt
Here values are extracted from myfile.txt according to the coordinates in coordinates.txt.

IFTHENCOL

Replaces a value in column 1 with a value in a column 2 if the column 1 value is in a reference file.

./LFE32 -M ifthencol -file -list -column1 -column2 -match -delim -header -out

-file = input file name. REQUIRED.
-list = list containing values for column 1. REQUIRED.
-column1 = column 1 number. REQUIRED. Options: any legal integer. Default = none.
-column2 = column 2 number. REQUIRED. Options: any legal integer. Default = none.
-match = how matching is done. Options: "full", "partial". Default = "full".
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", "colon", "slash", "bslash", "dash", "quote", "squote" b) any symbol or word without spaces (use backslash for escape if the symbol used interferes with command line). Default = "tab".
-header = input file header. Options: "yes", "no". Default = "no".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M ifthencol -file f.txt -list l.txt -column1 3 -column2 5 -match partial -delim \. -header yes -out results.txt
Here values in column 3 are replaced with values in column 5 if the column 3 value is found in file l.txt.

LINESPACING

This function adds empty rows (new line spaces) or any other text ("-symbol") any number of times ("-spacing") after each row. Rows can be systematically skipped ("-skip"). The added text can be repeated within the row ("-repeat"). It is possible to specify a number of first rows that remain unaffected ("-wait").

./LFE32 -M linespacing -file -out -symbol -wait -skip -spacing -repeat

-file = input file name. REQUIRED.
-out = output file name. Default = input file name + ".out".
-symbol = what to insert in the new rows. Options: any text, "newline", "tab", "space", "comma", "semicolon", "colon", "bslash", "slash", "dash", "quote", "squote", "dot", "asterisk"="star". Default = "newline".
-wait = how many first rows remain unaffected by the new spacing. Options: any integer. Default = 0.
-skip = how many rows should be skipped every time before new row(s) is(are) inserted. Options: any integer. Default = 0. Note: Number 1 would mean that rows stay together in groups of 2.
-spacing = how many times new row (space or text) should be inserted every time. Options: any integer. Default = 0. Note: To get one empty row after each row this number would have to be 1.
-repeat = how many times the new text should be repeated within each new row that is inserted. Options: any integer. Default = 1. Note: This option does not work when empty rows are inserted with "-symbol newline".

Example:
./LFE32 -M linespacing -file mydata.txt -out results.txt -symbol A -wait 5 -skip 2 -spacing 1 -repeat 10
Here the first 5 rows remain unchanged, then 1 new row is inserted after every 3 rows. The new row is 10 times letter "A": "AAAAAAAAAA".
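How "-wait", "-skip", "-spacing" and "-repeat" interact may be easier to see in code. The Python sketch below is a hedged illustration, not LFE's implementation; the function name linespacing is hypothetical, and it assumes that row counting for "-skip" starts immediately after the waited rows.

```python
def linespacing(rows, symbol="", wait=0, skip=0, spacing=0, repeat=1):
    """Insert `spacing` new rows after every group of `skip + 1` rows."""
    out = []
    group = skip + 1  # skip=2 keeps rows together in groups of 3
    for i, row in enumerate(rows):
        out.append(row)
        if i < wait:
            continue  # the first `wait` rows are left untouched
        if (i - wait + 1) % group == 0:
            # each inserted row is the symbol repeated `repeat` times
            out.extend([symbol * repeat] * spacing)
    return out


rows = ["r%d" % i for i in range(1, 12)]
for line in linespacing(rows, symbol="A", wait=5, skip=2, spacing=1, repeat=10):
    print(line)
```

With the settings of the example above, the inserted "AAAAAAAAAA" rows land after r8 and after r11.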

LIST2TABLE

This function converts a list into a table. Missing values are replaced as defined by "-missing". The list can also be inserted into a pre-existing table (thus altering the background table). See TABLE2LIST for the opposite functionality. Together these functions may come in handy for modifying tables.

./LFE32 -M list2table -file -table -header -dim -missing -delim -out

-file = input file name. REQUIRED. Note: that the list must not have a header.
-table = the name of the background table if it exists. Options: a valid file name. Note: if you don't have a background table to insert the list into, simply omit this flag.
-header = background table header. Options: "yes", "no". Default = "no".
-dim = dimensions of the table to be built by this function. Options: use format -dim 'rows','columns' Note: If "-dim" is not used, the minimal size is found automatically and used. If "-table" is used, "-dim" is ignored and the list is inserted into the existing table as it is. If "-dim" is in effect but the list doesn't contain enough values, the missing values are filled in as defined by "-missing".
-missing = what is used to indicate the missing values. Options: any symbol or word without spaces. Default = "NA".
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", "colon", "slash", "bslash", "dash", "quote", "squote" b) any symbol or word without spaces. Default = "tab".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M list2table -file mylist.txt -table mytable.txt -header yes -dim 100,120 -missing MA -delim space -out updated_table.txt
Here mytable.txt is updated with info from mylist.txt. The new table is defined to contain 100 rows and 120 columns. However, because "-table" is used, the "-dim" flag is ignored and the dimensions of the original table are used instead.

LISTREPLACE

This function replaces all entries in a specified column of the input file with the entries specified in a list file. It does not remove empty rows.

./LFE32 -M listreplace -file -list -header -column -replace-how -delim -out

-file = input file name. REQUIRED.
-list = list file name. REQUIRED.
Notes: this file should contain pairwise matches of old and new entries, one pair per row, separated by space or tab.
-header = file header. Options: "yes", "no". Default = "no".
-column = column name. Options: a) valid column number, b) "all" (replaces entries in all columns). Default = "all".
-replace-how = replacement mode. Options: "onlyfull", "alsopart". Default = "onlyfull". Note: "onlyfull" replaces the entry only if the match is full, "alsopart" replaces parts of the entry even when the match is not full (Example: "automobile" may become "mymobile" when "auto" is instructed to be replaced with "my").
-delim = file delimiter. Options: a) "tab", "space", b) any symbol or word without spaces. Default = "tab".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M listreplace -file myfile.txt -list pairs.txt -header yes -column all -replace-how onlyfull -delim space -out results.txt
Entries in all columns of myfile.txt (which contains a header) are replaced with the new entries according to the pairwise matches specified in pairs.txt.
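The difference between "onlyfull" and "alsopart" can be sketched in a few lines of Python. The function name list_replace and the dictionary of old/new pairs are assumptions for illustration; LFE reads the pairs from the list file.

```python
def list_replace(entry, pairs, how="onlyfull"):
    """Replace an entry according to old -> new pairs (illustrative only)."""
    if how == "onlyfull":
        # replace only when the whole entry equals an old value
        return pairs.get(entry, entry)
    # "alsopart": replace every occurrence of an old value inside the entry
    for old, new in pairs.items():
        entry = entry.replace(old, new)
    return entry


pairs = {"auto": "my"}
print(list_replace("automobile", pairs, "onlyfull"))  # automobile
print(list_replace("automobile", pairs, "alsopart"))  # mymobile
```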

LONGSORT

This function sorts rows in ascending order. It treats all rows as simple lines of continuous text. There is no file size or memory limit. Files that are too large to be stored in memory will be sorted in smaller chunks and the pre-sorted chunks are then merged. Note: The large file size handling capability comes at the cost of computational speed.

./LFE32 -M longsort -file -chunksize -header -out

-file = input file name. REQUIRED.
-chunksize = the maximal number of rows that fit in your computer memory. Options: any integer. Default = 100000. Note: if this number is too small, the sorting is done in very many chunks and the many merge steps that follow make the algorithm slow; if this number is too large, the chunks may not fit in the memory.
-header = input file header. Options: "yes", "no". Default = "no".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M longsort -file myfile.txt -chunksize 1000000 -header yes -out results.txt
All rows, except for the header, are sorted in the ascending order. Total of 1 million rows are sorted in each sub-sorting step. The sub-sorting results are written into temporary files that are deleted during runtime.
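The chunk-and-merge strategy described for LONGSORT is a form of external merge sort. The Python sketch below shows the general idea under stated assumptions: the function name longsort and its parameters are hypothetical, and it merges all pre-sorted chunks in one k-way pass with heapq.merge, whereas LFE is described as using several merge steps.

```python
import heapq
import os
import tempfile
from itertools import islice


def longsort(in_path, out_path, chunksize=100000, header=False):
    """Sort the lines of in_path as plain text while holding one chunk in memory."""
    chunk_paths = []
    with open(in_path) as src, open(out_path, "w") as dst:
        if header:
            dst.write(src.readline())  # the header is copied, not sorted
        # Pass 1: sort chunks of at most `chunksize` rows into temporary files.
        while True:
            chunk = [ln if ln.endswith("\n") else ln + "\n"
                     for ln in islice(src, chunksize)]
            if not chunk:
                break
            chunk.sort()
            fd, path = tempfile.mkstemp(text=True)
            with os.fdopen(fd, "w") as tmp:
                tmp.writelines(chunk)
            chunk_paths.append(path)
        # Pass 2: merge the pre-sorted chunks line by line.
        files = [open(p) for p in chunk_paths]
        try:
            dst.writelines(heapq.merge(*files))
        finally:
            for f in files:
                f.close()
            for p in chunk_paths:
                os.remove(p)  # temporary files are deleted at the end
```

A very small chunk size produces many temporary files, all of which this sketch opens at once during the merge; that is one reason a larger chunk size is preferable when memory allows.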

MATH

This function performs standard mathematical functions using numerical entries from two columns. The functions include: addition, subtraction, multiplication, division, power, ln, log, square root, reverse, sin, cos, tan, copy. The results are written into a new column (the location of which can be chosen). Other similar LFE math functions are RANGEMATH and SINGLEMATH.

./LFE32 -M math -file -function -header -delim -missing -put-before -out

-file = input file name. REQUIRED.
-function = mathematical function together with data location. REQUIRED. Options: "+", "-", "*", "/", "power" (to power of), "ln", "log", "minuslog" (negative log), "sqrt" (square root), "inv" (inverse value), "sin", "cos", "tan", "copy" (copy value from). Usage: The first 5 functions take 2 parameters (one before the function name, one after; NO SPACES allowed), the others take 1 parameter (after the function name; NO SPACES allowed). Numerical values can be used directly, column locations must be preceded with "col". Examples: "col2-col3" means that column 3 values are subtracted from column 2 values; "col5+8" means that number 8 is added to the values of column 5; "sqrtcol5" means that square root is taken from column 5 values; "copy2" means that the number 2 is copied into the new column; "copycol2" means that the values of column 2 are copied into the new column.
-header = file header. Options: "yes", "no". Default = "no".
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", b) any symbol or word without spaces. Default = "tab".
-missing = what is used to indicate the missing values. Options: a) any symbol or word without spaces, b) "none". Default = "NA". Note: "none" means that the missing entry fields are empty.
-put-before = the new column (with the newly calculated numbers) is put before this column. Options: a) any positive integer. Default = 1. Note: If you want to put the new column last, make this number larger than the number of columns in your data table.
-out = output file name. Default = input file name + ".out".

Example1:
./LFE32 -M math -file myfile -function invcol7 -header yes -delim space -missing NO -put-before 8 -out results.txt
Entries from column 7 are inverted (1/x) and the values are placed in a new column (before the old column 8). The missing value is defined as "NO".

Example2:
./LFE32 -M math -file myfile -function 2*col19 -header yes -delim semicolon -missing none -put-before 1 -out results.txt
Entries from column 19 are multiplied by 2 and the values are placed in a new column which is the first column. The missing value is defined as "none" which means "empty".

MERGECOLUMNS

This function merges the specified columns with or without placing a user-defined delimiter between the merged entries. The merged column is placed where the first to-be-merged column was located. If you want to change the location of the newly merged column, please refer to the LFE function SWAPCOLUMNS.

./LFE32 -M mergecolumns -file -columns -delim -merge-delim -out

-file = input file name. REQUIRED.
-columns = columns to be merged. REQUIRED. Options: any valid column numbers or ranges without spaces (Ex: 5,1-2 means that column 5 is merged with column 1, which is in turn merged with column 2; the merged column will appear in place of the old column 5). Note: The numbers and number ranges must be valid and non-overlapping (Ex: 4,1-5 and 2-4,4-9 are examples of overlapping ranges). You can change the merge order by typing in different sequences (Ex: 1,3,5-8 is different from 1,3,8-5 or 3,5-8,1).
-delim = file delimiter. Options: a) "tab", "space", "semicolon", b) any symbol or word without spaces. Default = "tab".
-merge-delim = merge delimiter. Options: a) "none", "tab", "space", "semicolon", b) any symbol or word without spaces. Default = "none". Note: Merge delimiter is placed between the merged columns. You may want to use an option other than "none" if you need to re-separate the merged columns later.
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M mergecolumns -file mytable.txt -columns 11,8,3-5 -delim space -merge-delim comma -out output.txt
In this example columns are merged in the order of 11,8,3,4,5, the corresponding original columns are removed, and the merged column is placed where column 11 used to be.
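The merge order and the placement of the merged column can be illustrated in Python. This is a sketch rather than LFE's code: parse_columns and merge_columns are hypothetical names, and a row is assumed to be a list of strings with 1-based column numbers.

```python
def parse_columns(spec):
    """Expand a spec such as "11,8,3-5" into [11, 8, 3, 4, 5], keeping order."""
    cols = []
    for part in spec.split(","):
        if "-" in part:
            a, b = (int(x) for x in part.split("-"))
            step = 1 if b >= a else -1  # "8-5" counts downwards
            cols.extend(range(a, b + step, step))
        else:
            cols.append(int(part))
    return cols


def merge_columns(row, spec, merge_delim=""):
    """Merge the listed columns; the result replaces the first listed column."""
    cols = parse_columns(spec)
    merged = merge_delim.join(row[c - 1] for c in cols)
    out = []
    for i, value in enumerate(row, start=1):
        if i == cols[0]:
            out.append(merged)  # merged column goes where the first one was
        elif i not in cols:
            out.append(value)   # the other merged columns are removed
    return out


row = ["c%d" % i for i in range(1, 13)]
print(merge_columns(row, "11,8,3-5", merge_delim=","))
```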

MULTBYE (specialized)

This function multiplies or divides all values in a specified column by a number in scientific (e-) notation (e.g. 2.3e8). Typically computers cannot handle very small or very large values (e.g. 5e-600). This function does not have this limitation. Values larger than e3 or smaller than e-3 are presented using the e-notation.

./LFE32 -M multbyE -file -e -header -missing -column -delim -function -out

-file = input file name. REQUIRED.
-e = number using the e-notation. REQUIRED.
-header = input file header. Options: "yes", "no". Default = "no".
-missing = used to indicate the missing values. Options: a) any symbol or word without spaces, b) "none". Default = "NA". Note: "none" means that the missing entry fields are empty. The rows with missing values are not considered.
-column = column number. Options: a) "all", b) any valid column number. Notes: the "all" option should only be used if all columns contain numbers. Default = "all".
-delim = file delimiter. Options: a) "tab", "space", b) any symbol or word without spaces. Default = "tab".
-function = mathematical function. Options: "multiply", "divide". Default = "multiply".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M multbyE -file myfile.txt -header yes -e -3.1e-250 -column 2 -delim tab -function divide -out results.txt
All values in column 2 are divided by -3.1e-250.
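To see why ordinary floating point cannot handle such numbers, and one way around the limit, consider the Python sketch below. It uses Python's decimal module as a stand-in technique; how LFE handles these values internally is not stated here, and the function name mult_by_e is hypothetical.

```python
from decimal import Decimal


def mult_by_e(value, e, function="multiply"):
    """Multiply or divide a value by an e-notation number without underflow."""
    x, factor = Decimal(value), Decimal(e)
    return x * factor if function == "multiply" else x / factor


# A standard 64-bit float silently underflows to zero:
print(float("5e-600"))                # 0.0
# Decimal keeps the exponent separately and stays exact here:
print(mult_by_e("5e-600", "2.3e8"))   # 1.15E-591
print(mult_by_e("6.2e-300", "-3.1e-250", "divide"))
```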

POINTREPLACE

Replace text at a certain location in a tabular file with new text (optionally: provided that the previous text is not as defined).

./LFE32 -M pointreplace -file -row -column -newtext -ifnot -delim -out

-file = input file name. REQUIRED.
-row = row number. REQUIRED.
-column = column number. REQUIRED.
-newtext = the new text that replaces the old text. REQUIRED.
-ifnot = the text that is not allowed to be replaced with "newtext". This can be ignored (left out) if it's not used.
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", "colon", "slash", "bslash", "dash", "quote", "squote" b) any symbol or word without spaces. Default = "tab".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M pointreplace -file myfile.txt -row 4 -column 8 -newtext 10.7 -ifnot NA -delim space -out results.txt
Text of row 4 and column 8 is replaced with "10.7" provided that the current entry is not "NA".

PREAPPEND

This function prepends or appends text to entries. It can also prepend or append the text repeatedly until the entry reaches a certain final length.

./LFE32 -M preappend -file -text -column -function -extend -delim -header -out

-file = input file name. REQUIRED.
-text = what text to prepend or append. REQUIRED. Options: any text. Default: no default.
-column = column number. Options: a) any valid integer column number, b) "0" (means that all columns are used). Default: 0.
-function = prepend or append text to entries. Options: "prepend", "append". Default: "prepend".
-extend = extend entries to this length using text from the '-text' switch. Options: any positive integer. Default: 0. Note: Use "0" or disregard this tag if you want to prepend or append the text once. Extending never exceeds the specified value, however it may stop before the specified value if the text is longer than the number of unused spaces. Example: using '-text house -extend 2' will not append/prepend anything because 2 is shorter than the length of the word "house".
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", "colon", "slash", "bslash", "dash", "quote", "squote", b) any symbol or word without spaces. Default = "tab".
-header = file header. Options: "yes", "no". Default = "no".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M preappend -file myfile.txt -text - -column 2 -function append -extend 5 -delim space -header yes -out results.txt
Here dash is appended to column 2 entries until they become 5 characters long. Those that are longer than 5 characters remain unaltered.
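The "-extend" rule (never exceed the target length, stop early if another copy of the text would not fit) can be written out as a short Python sketch. The function name preappend and its signature are assumptions for illustration, not LFE's implementation.

```python
def preappend(entry, text, function="prepend", extend=0):
    """Prepend or append `text`, optionally filling up to `extend` characters."""
    if extend == 0:
        times = 1  # plain mode: add the text exactly once
    else:
        # as many whole copies of `text` as fit without exceeding `extend`
        times = max(0, (extend - len(entry)) // len(text))
    pad = text * times
    return pad + entry if function == "prepend" else entry + pad


print(preappend("ab", "-", "append", extend=5))       # ab---
print(preappend("abcdefg", "-", "append", extend=5))  # abcdefg (already longer)
print(preappend("x", "house", "prepend", extend=2))   # x (text does not fit)
```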

RANDOMIZEROWS

This function randomizes a list by changing the order of rows. The number of rows in the output file can be changed to smaller - in this case a random sampling of rows is written into the output file.

./LFE32 -M randomizerows -file -n -header -out -out-remaining

-file = input file name. REQUIRED.
-n = the length (number of rows) of the output file (excluding the header, if it exists). Options: integer. Default = "no". Note: n only determines the size of the output, all rows of the original file have equal chance of getting chosen.
-header = file header. Options: "yes", "no". Default = "no". Note: If "yes" is selected, the header is extracted and dumped.
-out = output file name. Default = input file name + ".out".
-out-remaining = output file name for the rows that are not selected. Note: If this tag is not used, the corresponding file is not created.

Example:
./LFE32 -M randomizerows -file myfile -n 10 -header yes -out results.txt -out-remaining leftover.txt
Here the output file is 10 entries long or as large as the input file - whichever value is smaller. The remaining rows are dumped into leftover.txt. The header is extracted and not used for anything.
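One simple way to obtain a random subset together with the leftover rows is to shuffle once and split. The Python sketch below shows that idea; it is not a description of LFE's internals, and the function name randomize_rows and the seed parameter are hypothetical additions for reproducibility.

```python
import random


def randomize_rows(rows, n=None, seed=None):
    """Return (selected, remaining) after shuffling a copy of the rows."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)  # every row has the same chance of being chosen
    if n is None or n >= len(shuffled):
        return shuffled, []  # output is as large as the input
    return shuffled[:n], shuffled[n:]


rows = ["row%d" % i for i in range(25)]
selected, remaining = randomize_rows(rows, n=10, seed=1)
print(len(selected), len(remaining))  # 10 15
```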

RANGEMATH

This function performs the same functions as Excel's "sum()" and "average()". It adds together or averages values across columns. The old columns are replaced with one new one. Missing values are supported. Similar LFE functions: MATH, SINGLEMATH.

./LFE32 -M rangemath -file -columns -function -header -delim -missing -out

-file = input file name. REQUIRED.
-columns = column range. REQUIRED. Notes: columns should be separated by commas and ranges by dashes (see the example below); there should be no spaces. Multiple ranges supported but the ranges must be valid and non-overlapping.
-function = mathematical function. Options: "add", "sum" (add=sum), "mean", "avg" (mean=avg). Default: "add"/"sum". Notes: "add"/"sum" adds the column contents together; "mean"/"avg" calculates the mean value across the specified column range; missing values are not considered (see below).
-header = file header. Options: "yes", "no". Default = "no".
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", b) any symbol or word without spaces. Default = "tab".
-missing = what is used to indicate the missing values. Options: a) any symbol or word without spaces, b) "none". Default = "NA". Note: "none" means that the missing entry fields are empty.
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M rangemath -file mydata.txt -columns 2-5,8-11 -function avg -header yes -delim space -missing INVALID -out results.txt
Here the values in columns 2-5 and 8-11 are averaged.

REMOVEEMPTY

This function removes empty rows or empty entries from files by replacing them with nothing. (Empty is defined as to only contain spaces). It can also remove empty fields from the file, as well as clip empty spaces from the front and back of the entries. This function was designed for quality control of the input data. Note: This function can also be used to convert the file encoding between OS's (such as from Windows to Linux).

./LFE32 -M removeempty -file -remove-from -header -delim -out

-file = input file name. REQUIRED.
-remove-from = where the empty entries are removed from. Default = "all". Options 1: "all", "columns", "rows", "none". Options 2: "clip", "collapse". Note 1: "rows" removes only empty rows, "columns" removes empty entries from the table, "all" removes all empty rows and entries; these options will never remove empty parts from the entries, they only remove rows or entries that are completely empty; "none" does not remove anything, it simply rewrites the file (this can be used to convert encoding between different OS's). Note 2: "clip" removes empty characters from before and after each entry in the table (e.g. " name " will become "name"), "collapse" removes the entries that contain nothing or contain empty spaces (attn: since this option collapses columns within the row, the properties of the entire table may change; use this option with caution, it is more suitable for simple text than data tables).
-header = input file header. Options: "yes", "no". Default = "no".
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", b) any symbol or word without spaces. Default = "tab".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M removeempty -file myfile -remove-from rows -header yes -delim semicolon -out result.txt
This example removes only empty rows from the file.
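The "clip" and "collapse" options differ in whether the number of entries per row can change. A minimal Python sketch of the two behaviours follows; the function names clip and collapse are hypothetical, and a row is assumed to be a list of strings.

```python
def clip(row):
    """Remove empty characters before and after each entry; row length is kept."""
    return [entry.strip() for entry in row]


def collapse(row):
    """Drop entries that are empty or contain only spaces; the row may shrink."""
    return [entry for entry in row if entry.strip() != ""]


print(clip([" name ", "x"]))          # ['name', 'x']
print(collapse(["a", " ", "", "b"]))  # ['a', 'b']
```

Because collapse shortens rows that contain empty entries, columns may no longer line up afterwards, which is why the note above recommends caution with data tables.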

REPLACE

This function performs the classical find and replace with a few extra features. It is possible to find and replace a) an entire entry in a table (called entry), b) any part of an entry (called text) in a table, or c) replace an entire entry if the original entry contains a certain motif. The text replacement is done before the entry replacement thereby allowing to precondition the entries before they are replaced. Additionally this function can fill empty data fields with any text. It does not remove empty rows.

./LFE32 -M replace -file -orig-text -new-text -orig-entry -new-entry -column -delim -header -out -repeat

-file = input file name. REQUIRED.
-orig-text = any text (combination of symbols) that needs to be changed in any entry. Options: a) any text without spaces, b) "none", "space", "dspace" (double space), "semicolon", "dquote" (double quote mark), "squote" (single quote mark), "star" (*), "cutout" (/.../), "period" (.), "underscore" (_). Notes: When "none" is used, all missing entries will be replaced with what is defined by '-new-text'.
-new-text = any text (combination of symbols) that is used instead of what is defined by '-orig-text'. Options: a) any text without spaces, b) "none", "space", "dspace" (double space), "semicolon", "dquote" (double quote mark), "squote" (single quote mark), "star" (*), "cutout" (/.../), "period" (.), "underscore" (_). Notes: When "none" is used, the text defined by '-orig-text' is deleted.
-orig-entry = any text (combination of symbols); the entries that match this perfectly are going to be replaced with what is defined by '-new-entry'. Options: a) any text without spaces, b) "none", "space", "dspace" (double space), "semicolon", "dquote" (double quote mark), "squote" (single quote mark), "star" (*), "cutout" (/.../), "period" (.), "underscore" (_). Notes: When "none" is used, all missing entries will be replaced with what is defined by '-new-entry'.
-new-entry = any text (combination of symbols) that is going to replace what is defined in '-orig-entry'. Options: a) any text without spaces, b) "none", "space", "dspace" (double space), "semicolon", "dquote" (double quote mark), "squote" (single quote mark), "star" (*), "cutout" (/.../), "period" (.), "underscore" (_). Notes: When "none" is used, the entries that match what is defined by '-orig-entry' are deleted.
-column = column number. Options: a) "all", b) any valid column number. Notes: the "all" option means that all columns are considered for search and replace. Default = "all".
-delim = file delimiter. Options: a) "tab", "space", "semicolon", b) any symbol or word without spaces. Default = "tab".
-header = input file header. Options: "yes", "no". Default = "no".
-out = output file name. Default = input file name + ".out".
-repeat = how many times the replace command is executed. Options: any integer. Default = 1.

Example1:
./LFE32 -M replace -file myfile.txt -orig-text test -new-text none -orig-entry cases -new-entry controls -column 1 -delim space -header yes -out results.txt
In this example search and replace is done for column 1, the header remains unchanged. First the word "test" is deleted from each entry that contains it. Then, the entries that equal "cases" are replaced with "controls". Note that first "test" is removed and then, only if the remaining entry is precisely equal to "cases", is the entry replaced. For example the entry "casestest" will be replaced with "controls" but "cases_test" will not.
Example2:
./LFE32 -M replace -file myfile.txt -orig-entry none -new-entry NA
In this minimal example every missing field is replaced with "NA". The other settings are the default values.
Example3:
./LFE32 -M replace -file myfile.txt -orig-text car -new-entry vehicle -column 2 -out results.txt
In this example column 2 is searched for strings that contain "car". The ones that do are replaced with "vehicle". Note that the entire entry is replaced with "vehicle", not just the word "car".
Example4:
./LFE32 -M replace -file myfile.txt -orig-text dspace -new-text space -column all -out results.txt -repeat 10
In this example all consecutive spaces are replaced with just one space (and this is carried out 10 times to ensure that all repetitions are removed).
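The order of operations in Example1 (text replacement first, entry replacement second) can be sketched in Python. The function name replace_entry is hypothetical and the sketch covers only the case where all four text/entry options are given.

```python
def replace_entry(entry, orig_text, new_text, orig_entry, new_entry):
    """Apply the text replacement first, then the whole-entry replacement."""
    entry = entry.replace(orig_text, new_text)  # step 1: part of the entry
    if entry == orig_entry:                     # step 2: exact match only
        entry = new_entry
    return entry


# "-new-text none" deletes the text, which corresponds to "" here.
print(replace_entry("casestest", "test", "", "cases", "controls"))   # controls
print(replace_entry("cases_test", "test", "", "cases", "controls"))  # cases_
```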

ROWSORT (memory intensive)

This simple function sorts rows in ascending order. It treats all rows as simple lines of continuous text. Attn: Since all rows are stored in memory, this function may not be usable with a large number of rows or numerous columns. For very large files use the LFE function LONGSORT.

./LFE32 -M rowsort -file -header -out

-file = input file name. REQUIRED.
-header = input file header. Options: "yes", "no". Default = "no".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M rowsort -file myfile.txt -header yes -out results.txt
All rows, except for the header, are sorted in the ascending order.

RUNSINDEX

This function generates indexes based on "runs" (as in the runs test) in specified columns. The first index is 1 and it is incremented by 1 every time the numbers turn from ascending to descending. Example showing values and their corresponding runs index in parentheses: 1(1),2(1),3(1),4(1),5(1),3(2),4(2),5(2),6(2),1(3),2(3).

./LFE32 -M runsindex -file -delim -column -index-before -header -missing -out

-file = input file name. REQUIRED.
-delim = file delimiter. Options: a) "tab", "space", "semicolon", "colon", b) any symbol or word without spaces. Default = "tab".
-column = column name. Options: valid column number, 0 (means all columns) Default = "0".
-index-before = column number before which the newly generated indexes should go. Options: any integer Default = "1".
-header = input file header. Options: "yes", "no". Default = "no".
-missing = used to indicate the missing values. Options: any symbol or word without spaces. Default = "NA".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M runsindex -file myfile.txt -delim space -column 2 -index-before 5 -header yes -missing na -out results.txt
The runs index is put before column 5.
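The indexing rule can be reproduced with a few lines of Python. This is a minimal sketch with a hypothetical function name, runs_index; it assumes that equal consecutive values do not start a new run and it does not handle missing values.

```python
def runs_index(values):
    """Return the runs index for each value: starts at 1, +1 on every dip."""
    index, out, prev = 1, [], None
    for v in values:
        if prev is not None and v < prev:
            index += 1  # the upward run was interrupted by a dip
        out.append(index)
        prev = v
    return out


print(runs_index([1, 2, 3, 4, 5, 3, 4, 5, 6, 1, 2]))
# [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3]
```

The printed result matches the worked example given in the RUNSINDEX description.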

SINGLEMATH

This function performs exp, ln, log, sqrt, square, inv, rev, abs and other functions (see below) on any or all columns of your table. It performs add, subtract, divide, multiply if supplemented with a number (example: add10.2 adds 10.2 to the values in the selected columns). If you need to do math with two columns, please refer to the LFE function MATH. If you want to do math across columns, refer to the LFE function RANGEMATH. If you need math with two tables (files), use DOUBLEMATH.

./LFE32 -M singlemath -file -function -columns -missing -header -delim -out

-file = input file name. REQUIRED.
-function = mathematical function. REQUIRED. Options: "exp", "ln", "log", "log2", "sqrt", "square", "inv" (inverse), "rev" (reverse sign), "abs" (absolute value), "add", "subtract", "multiply", "divide", "cutabove", "cutbelow". NB: The last 6 options are used in conjunction with a number (example: divide4.7 divides numbers by 4.7).
-columns = which columns to use. REQUIRED. Notes: columns should be separated by commas and ranges by dashes (see the example below); there should be no spaces. If you want all columns, use a wide enough range (the range is allowed to exceed the number of columns in the file).
-missing = what is used to indicate the missing values. Options: any symbol or word without spaces. Default = "NA".
-header = file header. Options: "yes", "no". Default = "no".
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", "colon", "slash", "bslash", "dash", "quote", "squote", b) any symbol or word without spaces. Default = "tab".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M singlemath -file myfile.txt -function ln -columns 2,4-6 -missing na -header yes -delim comma -out results.txt
The natural logarithm is taken of the values in columns 2, 4, 5 and 6.
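
To make the -columns and -function conventions concrete, here is a minimal Python sketch of how a column specification such as 2,4-6 and a function such as add10.2 could be interpreted. It is not LFE's code: the names parse_columns, parse_function and apply_to_row are hypothetical, only a handful of functions are covered, and the output number formatting is an assumption.

```python
import math

# Functions that take a trailing number, e.g. "add10.2" or "divide4.7".
ARITHMETIC = {
    "add": lambda x, n: x + n,
    "subtract": lambda x, n: x - n,
    "multiply": lambda x, n: x * n,
    "divide": lambda x, n: x / n,
}
# A few of the functions that take no number.
PLAIN = {"ln": math.log, "sqrt": math.sqrt, "abs": abs}


def parse_columns(spec):
    """Expand a spec such as "2,4-6" into [2, 4, 5, 6] (1-based)."""
    columns = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            columns.extend(range(int(first), int(last) + 1))
        else:
            columns.append(int(part))
    return columns


def parse_function(text):
    """Split "add10.2" into ("add", 10.2); plain names give (name, None)."""
    for name in ARITHMETIC:
        if text.startswith(name) and len(text) > len(name):
            return name, float(text[len(name):])
    return text, None


def apply_to_row(fields, columns, function, missing="NA"):
    """Apply the function to the selected columns of one row of strings."""
    name, number = parse_function(function)
    result = list(fields)
    for col in columns:
        # Columns beyond the row width and missing values are left alone.
        if col > len(result) or result[col - 1] == missing:
            continue
        x = float(result[col - 1])
        y = PLAIN[name](x) if number is None else ARITHMETIC[name](x, number)
        result[col - 1] = str(y)
    return result


print(parse_columns("2,4-6"))
# prints [2, 4, 5, 6]
print(apply_to_row(["id7", "1", "NA", "4"], parse_columns("2-9"), "multiply2"))
# prints ['id7', '2.0', 'NA', '8.0']
```

Note that the range 2-9 is wider than the row; the extra column numbers are simply skipped, which mirrors the note that a range may exceed the size of the file.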

SWAPCOLUMNS

This function swaps two columns in a tabular text file. For more advanced column manipulations see ORGANIZECOLUMNS or SNAP.

./LFE32 -M swapcolumns -file -column1 -column2 -delim -out

-file = input file name. REQUIRED.
-column1 = first column name. REQUIRED. Options: valid column number. Default = no default.
-column2 = second column name. REQUIRED. Options: valid column number. Default = no default.
-delim = file delimiter. Options: a) "tab", "space", b) any symbol or word without spaces. Default = "tab".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M swapcolumns -file myfile.txt -column1 3 -column2 7 -delim , -out results.txt
Columns 3 and 7 are swapped, the file is comma-delimited.

TABLE2LIST

This function converts a table to a list with one value per row, optionally showing the coordinates of each value. See LIST2TABLE for the opposite functionality.

./LFE32 -M table2list -file -columns -function -missing -delim -header -out

-file = input file name. REQUIRED.
-columns = columns to be converted. REQUIRED. Options: valid column number. Default = no default.
-function = whether value coordinates are shown. Options: "nocoord", "showcoord". Default = "nocoord".
-missing = what is used to indicate the missing values. Options: any symbol or word without spaces. Default = "NA".
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", "colon", "slash", "bslash", "dash", "quote", "squote", b) any symbol or word without spaces. Default = "tab".
-header = file header. Options: "yes", "no". Default = "no".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M table2list -file mytable.txt -columns 1-999 -function showcoord -missing NA -delim space -header yes -out mylist.txt
Here mytable.txt is converted into mylist.txt (showing value coordinates); the file header is stripped and is not kept in the output.
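
The idea of a table-to-list conversion can be sketched in Python as follows. This is only an illustration: the function name table_to_list is hypothetical, and both the row-by-row output order and the (row, column, value) layout of the coordinates are assumptions, since the exact LFE output format is not described here.

```python
def table_to_list(rows, show_coordinates=False):
    """Flatten a table (list of rows) into one entry per value.

    Assumption: values are emitted row by row, and coordinates are
    1-based (row, column) pairs placed before the value.
    """
    entries = []
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row, start=1):
            if show_coordinates:
                entries.append((r, c, value))
            else:
                entries.append((value,))
    return entries


table = [["a", "b"], ["c", "d"]]
for entry in table_to_list(table, show_coordinates=True):
    print(*entry)
# prints:
# 1 1 a
# 1 2 b
# 2 1 c
# 2 2 d
```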

TRANSPOSE

This function transposes a table (swaps columns with rows). The table is processed a chosen number of columns at a time. For small tables, transposing all columns in one go gives maximal speed. Very large tables, however, may not fit in memory; these need to be transposed a smaller number of columns at a time. This function lets you select the maximum number of columns held in memory during transposition, which makes it possible to handle very large tables (data sets). Note: the capability to handle very large files comes at the cost of reduced speed.

./LFE32 -M transpose -file -columns -col-chunk -delim -out

-file = input file name. REQUIRED.
-columns = number of columns in the table. Options: any positive integer, 0 (means that max column number is found automatically), -1 (means that the column number is calculated based on the first row). Default = 0.
-col-chunk = number of columns transposed in one iteration. Options: any integer. Default = 1000. Note: Use larger values for smaller data sets and smaller numbers for very large data sets.
-delim = file delimiter. Options: a) "tab", "space", "comma", "semicolon", "colon", "slash", "bslash", "dash", "quote", "squote", b) any symbol or word without spaces. Default = "tab".
-out = output file name. Default = input file name + ".out".

Example:
./LFE32 -M transpose -file mytable.txt -columns 0 -col-chunk 10 -delim space -out results.txt
The file is transposed 10 columns at a time; the largest column count found in the file is used as the width of the table (-columns 0).
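
The chunked approach described above can be sketched in Python. This is a minimal illustration of the general technique, not LFE's implementation: the function name transpose_in_chunks is hypothetical, the width is taken from the widest row (analogous to -columns 0), and padding short rows with empty fields is an assumption.

```python
def transpose_in_chunks(in_path, out_path, col_chunk=1000, delim="\t"):
    """Transpose a delimited text file, holding at most col_chunk
    columns in memory at any time. The input is re-read once per chunk,
    which is why smaller chunks are slower."""
    # First pass: find the widest row.
    with open(in_path) as src:
        n_cols = max(
            (len(line.rstrip("\n").split(delim)) for line in src),
            default=0,
        )

    with open(out_path, "w") as dst:
        for start in range(0, n_cols, col_chunk):
            stop = min(start + col_chunk, n_cols)
            # One list per input column in this chunk; each becomes an output row.
            chunk = [[] for _ in range(stop - start)]
            with open(in_path) as src:
                for line in src:
                    fields = line.rstrip("\n").split(delim)
                    for i in range(start, stop):
                        # Assumption: short rows are padded with empty fields.
                        chunk[i - start].append(fields[i] if i < len(fields) else "")
            for column in chunk:
                dst.write(delim.join(column) + "\n")
```

A call such as transpose_in_chunks("mytable.txt", "results.txt", col_chunk=10, delim=" ") corresponds roughly to the example above. The trade-off is visible in the loop structure: memory use is bounded by col_chunk times the number of rows, while the number of passes over the input grows as col_chunk shrinks.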

ToomasHaller.com 2017
