Package org.gbif.utils.file
Class FileUtilsTest
java.lang.Object
org.gbif.utils.file.FileUtilsTest
- Author:
- markus
-
Constructor Summary
Constructors
Constructor        Description
FileUtilsTest()
-
Method Summary
Modifier and Type   Method                             Description
static void         assertUnixSortOrder(File sorted)
static void         assertUnixSortOrder
void                humanReadableByteCountTest
void                testDeleteRecursive                Tests deleting a directory recursively.
void                testMergeEmptyFiles
void                testMergeSortedFiles
void                testMergeSortedFilesSpeed()
void                testMultiFileSort                  Test sorting multiple files into a single file.
void                testMultiFileSort2ndColumn         Test sorting multiple files into a single file.
void                testSort()                         Test using GNU sort (if available on this platform).
void                testSortingHeaderlessFile
void                testSortingMac                     Tests sorting Mac line endings (\r), which don't work with Unix sort.
void                testSortingUnevenLengths           Tests sorting by a column with uneven length strings as the sort column.
void                testSortingUnicodeFile             Sorting strings containing characters which are surrogate pairs, meaning Unicode characters beyond U+FFFF, will give different results between GNU sort and a Java String comparator.
void                testSortingUnix                    Tests sorting Unix line endings (\n), which work with Unix sort.
void                testSortingWindows                 Tests sorting Windows line endings (\r\n), which work with Unix sort.
void                testSortingWithHeaders
void                testSortingWithNonFirstIdColumn
void                testSortingWithQuotedDelimiters    If only columns containing delimiters are quoted in CSV, we can't use GNU sort.
void                testSortInJava                     Test that ensures the chunk file is deleted at the end of the sortInJava method.
-
Constructor Details
-
FileUtilsTest
public FileUtilsTest()
-
-
Method Details
-
assertUnixSortOrder
- Throws:
IOException
-
assertUnixSortOrder
- Throws:
IOException
-
humanReadableByteCountTest
-
testDeleteRecursive
Tests deleting a directory recursively.
- Throws:
IOException
-
testMergeSortedFilesSpeed
@Test
@Disabled("Run manually to check the performance of merging sorted files.")
public void testMergeSortedFilesSpeed() throws IOException
- Throws:
IOException
-
testMergeSortedFiles
- Throws:
IOException
-
testMergeEmptyFiles
- Throws:
IOException
-
testSortingHeaderlessFile
- Throws:
IOException
-
testSortingUnicodeFile
Sorting strings that contain characters represented as surrogate pairs, meaning Unicode characters beyond U+FFFF, gives different results between GNU sort and a Java String comparator. "ﬂ LATIN SMALL LIGATURE FL" is U+FB02. "𐃍 LINEAR B IDEOGRAM B241 CHARIOT" is U+100CD. GNU sort uses this order, based on the value of the whole character. Java represents 𐃍 as the surrogate pair U+D800 U+DCCD in UTF-16 and compares the individual halves of the pair, so it places 𐃍 before ﬂ, which is the wrong order.
- Throws:
IOException
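The difference described above can be reproduced without any files. The following is a minimal sketch, not the code used by FileUtils: the class name CodePointOrderSketch and the comparator BY_CODE_POINT are hypothetical, and the code point comparator only mirrors the "whole character" ordering that the text attributes to GNU sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class CodePointOrderSketch {

  /** Hypothetical comparator: compares whole code points instead of UTF-16 code units. */
  static final Comparator<String> BY_CODE_POINT = (a, b) -> {
    int i = 0;
    int j = 0;
    while (i < a.length() && j < b.length()) {
      int ca = a.codePointAt(i);
      int cb = b.codePointAt(j);
      if (ca != cb) {
        return Integer.compare(ca, cb);
      }
      i += Character.charCount(ca);
      j += Character.charCount(cb);
    }
    // One string is a prefix of the other: the shorter one sorts first.
    return Integer.compare(a.length() - i, b.length() - j);
  };

  /** Returns a sorted copy so the input list stays untouched. */
  static List<String> sorted(List<String> input, Comparator<String> order) {
    List<String> copy = new ArrayList<>(input);
    copy.sort(order);
    return copy;
  }

  public static void main(String[] args) {
    String ligature = "\uFB02";       // LATIN SMALL LIGATURE FL, U+FB02
    String chariot = "\uD800\uDCCD";  // LINEAR B IDEOGRAM B241 CHARIOT, U+100CD
    List<String> lines = Arrays.asList(ligature, chariot);

    // String.compareTo works on UTF-16 code units: 0xD800 < 0xFB02, so the chariot comes first.
    System.out.println(sorted(lines, Comparator.naturalOrder()).get(0).equals(chariot));

    // Code point order: U+FB02 < U+100CD, so the ligature comes first.
    System.out.println(sorted(lines, BY_CODE_POINT).get(0).equals(ligature));
  }
}
```

Both lines print true. A test that compares GNU sort output with Java output would need a comparator along these lines, or must expect the two orders to differ.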
-
testSortingMac
Tests sorting Mac line endings (\r), which don't work with Unix sort.
- Throws:
IOException
-
testSortingUnix
Tests sorting Unix line endings (\n), which work with Unix sort.
- Throws:
IOException
-
testSortingWindows
Tests sorting Windows line endings (\r\n), which work with Unix sort.
- Throws:
IOException
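The three line-ending tests above come down to which characters a newline-based tool treats as record boundaries. The following is a minimal sketch, assuming a splitter that, like Unix sort by default, breaks records only at \n. The class and method names are hypothetical and not part of FileUtils.

```java
final class LineEndingSketch {

  /** Counts the records seen by a tool that treats only '\n' as the record separator. */
  static int recordsSplitOnNewline(String content) {
    if (content.isEmpty()) {
      return 0;
    }
    // Limit -1 keeps trailing empty strings, so a final terminator shows up as one empty tail element.
    String[] parts = content.split("\n", -1);
    boolean endsWithTerminator = parts[parts.length - 1].isEmpty();
    return endsWithTerminator ? parts.length - 1 : parts.length;
  }

  public static void main(String[] args) {
    System.out.println(recordsSplitOnNewline("b\na\nc\n"));       // Unix endings: 3
    System.out.println(recordsSplitOnNewline("b\r\na\r\nc\r\n")); // Windows endings: 3
    System.out.println(recordsSplitOnNewline("b\ra\rc\r"));       // Mac endings: 1
  }
}
```

With \r-only endings the whole file is a single record, so there is nothing for a newline-based sort to reorder. With \r\n endings each record keeps a trailing \r, but the records are still separated.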
-
testSortingUnevenLengths
Tests sorting by a column with uneven length strings as the sort column. The order must not differ depending on whether the column is the last one or not. The "-k×,×" argument to sort is essential here: without the end position, the sort key runs to the end of the line, so the delimiter and the following column become part of the sort order.
- Throws:
IOException
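The effect of the key end position can be illustrated in plain Java. This is a minimal sketch under stated assumptions: it uses a comma delimiter and plain String comparison as a stand-in for byte-wise sorting, and the class SortKeyRangeSketch and its methods are hypothetical. It does not call GNU sort or FileUtils.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SortKeyRangeSketch {

  /** Key that starts at the given column and runs to the end of the line (no end position). */
  static String keyToEndOfLine(String line, char delimiter, int column) {
    int start = 0;
    // Assumes the line has at least column + 1 fields.
    for (int i = 0; i < column; i++) {
      start = line.indexOf(delimiter, start) + 1;
    }
    return line.substring(start);
  }

  /** Key restricted to the given column only (start and end position on the same column). */
  static String keyColumnOnly(String line, char delimiter, int column) {
    String rest = keyToEndOfLine(line, delimiter, column);
    int end = rest.indexOf(delimiter);
    return end < 0 ? rest : rest.substring(0, end);
  }

  static List<String> sortToEndOfLine(List<String> lines, int column) {
    List<String> copy = new ArrayList<>(lines);
    copy.sort(Comparator.comparing((String l) -> keyToEndOfLine(l, ',', column)));
    return copy;
  }

  static List<String> sortColumnOnly(List<String> lines, int column) {
    List<String> copy = new ArrayList<>(lines);
    copy.sort(Comparator.comparing((String l) -> keyColumnOnly(l, ',', column)));
    return copy;
  }

  public static void main(String[] args) {
    // The same two keys, "1" and "1 a", once as the first column and once as the last.
    List<String> keyFirst = Arrays.asList("1,x", "1 a,y");
    List<String> keyLast = Arrays.asList("x,1", "y,1 a");

    // Key runs to end of line: ',' (0x2C) is compared with ' ' (0x20), so "1 a" jumps ahead.
    System.out.println(String.join(" | ", sortToEndOfLine(keyFirst, 0))); // 1 a,y | 1,x
    System.out.println(String.join(" | ", sortToEndOfLine(keyLast, 1)));  // x,1 | y,1 a

    // Key limited to the column: the same order in both layouts.
    System.out.println(String.join(" | ", sortColumnOnly(keyFirst, 0)));  // 1,x | 1 a,y
    System.out.println(String.join(" | ", sortColumnOnly(keyLast, 1)));   // x,1 | y,1 a
  }
}
```

Only the column-limited key gives the same relative order of "1" and "1 a" whether the sort column is first or last, which is the property the test checks.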
-
testSortingWithHeaders
- Throws:
IOException
-
testSortingWithNonFirstIdColumn
- Throws:
IOException
-
testSortingWithQuotedDelimiters
If only columns containing delimiters are quoted in CSV, we can't use GNU sort.
X,"Look, now!",1
X,Why should I,2
- Throws:
IOException
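The two sample rows above show why a purely delimiter-based tool miscounts columns. The following is a minimal sketch, assuming a comma delimiter and double-quote enclosure. The class QuotedCsvSketch and its methods are hypothetical, and the quote-aware splitter deliberately ignores escaped quotes inside fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class QuotedCsvSketch {

  /** Splits on every delimiter, as a field-separator option of a sort tool would. */
  static List<String> splitNaive(String line, char delimiter) {
    return Arrays.asList(line.split(Pattern.quote(String.valueOf(delimiter)), -1));
  }

  /** Splits on delimiters outside quotes only; escaped quotes are not handled in this sketch. */
  static List<String> splitQuoted(String line, char delimiter, char quote) {
    List<String> fields = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inQuotes = false;
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (c == quote) {
        inQuotes = !inQuotes;           // toggle state, drop the quote character itself
      } else if (c == delimiter && !inQuotes) {
        fields.add(current.toString()); // field boundary
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    fields.add(current.toString());
    return fields;
  }

  public static void main(String[] args) {
    String quoted = "X,\"Look, now!\",1";
    String plain = "X,Why should I,2";

    System.out.println(splitNaive(quoted, ',').size());       // 4
    System.out.println(splitNaive(plain, ',').size());        // 3
    System.out.println(splitQuoted(quoted, ',', '"').size()); // 3
    System.out.println(splitQuoted(quoted, ',', '"').get(2)); // 1
  }
}
```

With the naive split, the third column of the first row is ` now!"` rather than `1`, so sorting on that column would compare unrelated values across rows.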
-
testSortInJava
Tests that the chunk files are deleted at the end of the sortInJava method. Otherwise, unwanted chunk files would be left over.
- Throws:
IOException
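The property checked here, that no chunk files survive the sort, can be sketched independently of FileUtils. The code below is an assumption-laden illustration, not the sortInJava implementation: the class ChunkCleanupSketch, the ".chunk" file naming, and the simplified in-memory merge step are all hypothetical. The point is the finally block that removes every chunk that was registered.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ChunkCleanupSketch {

  /** Sorts input into output via temporary chunk files and always deletes the chunks. */
  static void sortWithChunks(Path input, Path output, int linesPerChunk) throws IOException {
    List<Path> chunks = new ArrayList<>();
    try {
      List<String> buffer = new ArrayList<>();
      try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
        String line;
        while ((line = reader.readLine()) != null) {
          buffer.add(line);
          if (buffer.size() == linesPerChunk) {
            writeChunk(input, chunks, buffer);
          }
        }
      }
      if (!buffer.isEmpty()) {
        writeChunk(input, chunks, buffer);
      }
      // Simplified merge for brevity: a real external sort would stream a k-way merge instead.
      List<String> merged = new ArrayList<>();
      for (Path chunk : chunks) {
        merged.addAll(Files.readAllLines(chunk, StandardCharsets.UTF_8));
      }
      Collections.sort(merged);
      Files.write(output, merged, StandardCharsets.UTF_8);
    } finally {
      // Runs on success and on failure, so no chunk file is left behind.
      for (Path chunk : chunks) {
        Files.deleteIfExists(chunk);
      }
    }
  }

  private static void writeChunk(Path input, List<Path> chunks, List<String> buffer)
      throws IOException {
    Path chunk = input.resolveSibling(input.getFileName() + ".chunk" + chunks.size());
    chunks.add(chunk); // register before writing so cleanup also covers a partially written file
    Collections.sort(buffer);
    Files.write(chunk, buffer, StandardCharsets.UTF_8);
    buffer.clear();
  }

  /** Runs the sort in a fresh temp directory and lists the file names that remain there. */
  static List<String> remainingFilesAfterSort() {
    try {
      Path dir = Files.createTempDirectory("chunk-sketch");
      Path input = dir.resolve("input.txt");
      Files.write(input, Arrays.asList("d", "b", "e", "a", "c"), StandardCharsets.UTF_8);
      sortWithChunks(input, dir.resolve("sorted.txt"), 2);
      try (Stream<Path> files = Files.list(dir)) {
        return files.map(p -> p.getFileName().toString()).sorted().collect(Collectors.toList());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(remainingFilesAfterSort()); // [input.txt, sorted.txt]
  }
}
```

A test in the spirit of testSortInJava would assert that the working directory contains only the input and the sorted output after the call returns.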
-
testSort
Test using GNU sort (if available on this platform).
- Throws:
IOException
-
testMultiFileSort
Test sorting multiple files into a single file. The sort column is the first column, so GNU sort is used.
- Throws:
IOException
-
testMultiFileSort2ndColumn
Test sorting multiple files into a single file. The sort column is the second column, so Java sort is used.
- Throws:
IOException
-