public class TabularFileMetadataExtractor extends Object
Extracts TabularFileMetadata from a tabular file.

Modifier and Type | Method and Description |
---|---|
static TabularFileMetadata | extractTabularFileMetadata(Path filePath) - Extract metadata from a tabular file using a sample (defined by MAX_SAMPLE_SIZE) of the file. |
static Optional<Character> | getDelimiterChar(List<String> sample) - Given a sample of lines, this method tries to determine the delimiter character used. |
public static TabularFileMetadata extractTabularFileMetadata(Path filePath) throws IOException, UnknownCharsetException
Extract metadata from a tabular file using a sample (defined by MAX_SAMPLE_SIZE) of the file.
The extraction process is based on the frequency of characters in the sample, using 3 different approaches.
The method will not return any default value if no delimiter and/or quote character can be found in the sample.
The caller should decide which default values should be used to read the file.
Parameters:
filePath - a Path pointing to a file (not a folder).
Returns:
TabularFileMetadata, never null (but the content can be null).
Throws:
IOException
UnknownCharsetException
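Because the method returns no default when detection fails, the caller has to supply its own. The following is a minimal sketch of that fallback, not part of the library: `ReaderSettings` and both default values are hypothetical, and the two constructor arguments stand in for whatever delimiter and quote character the returned TabularFileMetadata holds (its accessors are not shown on this page).

```java
// Hypothetical helper class, shown only to illustrate caller-side defaults.
class ReaderSettings {
    // Assumed defaults; choose values that match the files you expect.
    static final char DEFAULT_DELIMITER = ',';
    static final char DEFAULT_QUOTE = '"';

    final char delimiter;
    final char quote;

    // Either argument may be null when nothing was detected in the sample.
    ReaderSettings(Character detectedDelimiter, Character detectedQuote) {
        this.delimiter = detectedDelimiter != null ? detectedDelimiter : DEFAULT_DELIMITER;
        this.quote = detectedQuote != null ? detectedQuote : DEFAULT_QUOTE;
    }
}
```

With this sketch, `new ReaderSettings(null, null)` falls back to a comma and a double quote, while a detected tab delimiter is kept as is.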
public static Optional<Character> getDelimiterChar(List<String> sample)
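To give a feel for frequency-based detection, here is a simplified sketch. It is not the library's algorithm (this page does not describe it): the class `DelimiterSketch`, the method `guessDelimiter`, and the candidate set are all assumptions. It picks the candidate that occurs the same non-zero number of times on every sampled line, preferring the highest count.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Illustrative only; names and heuristic are assumptions, not the library's code.
class DelimiterSketch {
    // Assumed candidate delimiters.
    private static final char[] CANDIDATES = {',', '\t', ';', '|'};

    static Optional<Character> guessDelimiter(List<String> sample) {
        Character best = null;
        long bestCount = 0;
        for (char candidate : CANDIDATES) {
            long first = -1;
            boolean consistent = !sample.isEmpty();
            for (String line : sample) {
                // Count occurrences of the candidate on this line.
                long count = line.chars().filter(c -> c == candidate).count();
                if (first < 0) {
                    first = count;
                } else if (count != first) {
                    consistent = false;
                    break;
                }
            }
            // Keep the candidate with the highest consistent per-line count.
            if (consistent && first > bestCount) {
                best = candidate;
                bestCount = first;
            }
        }
        // Empty when no candidate appears consistently, mirroring the Optional return type.
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("id\tname\tcountry", "1\tPuma concolor\tCA");
        // The caller chooses the fallback (here a comma) when nothing is found.
        char delimiter = guessDelimiter(sample).orElse(',');
        System.out.println("tab detected: " + (delimiter == '\t'));
    }
}
```

Returning an `Optional` keeps the "nothing found" case explicit, so the fallback stays in the caller's hands. This simple count ignores quoting, so a delimiter inside a quoted field would skew it.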
Parameters:
sample -

Copyright © 2024 Global Biodiversity Information Facility (GBIF). All rights reserved.