Copyright | (c) CNRS 2017 |
---|---|
License | AGPL + CECILL v3 |
Maintainer | team@gargantext.org |
Stability | experimental |
Portability | POSIX |
Safe Haskell | None |
Language | Haskell2010 |
Gargantext.Core.Text.Corpus.Parsers
Description
Gargantext analyzes semi-structured text, which must be parsed before it can be analyzed.
The parsers assume that the format of the text is known (the FileFormat data type); the right parser is chosen from the list of available parsers according to that format.
This module mainly describes how to add a new parser to Gargantext; please follow the types.
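As a rough illustration of "following the types", here is a minimal, self-contained sketch of a format-driven dispatch with the same shape as parseFormat. It does not use the Gargantext API: DemoFormat, DemoDocument, demoParseFormat and both demo formats are hypothetical stand-ins invented for this example, since the real constructors of FileFormat and the fields of HyperdataDocument are not listed on this page.

```haskell
import qualified Data.ByteString.Char8 as BC

-- Hypothetical stand-in for FileFormat (the real constructors are not shown here).
data DemoFormat = DemoLines | DemoTabSeparated
  deriving (Show, Eq)

-- Hypothetical stand-in for HyperdataDocument.
data DemoDocument = DemoDocument
  { demoTitle :: BC.ByteString
  , demoBody  :: BC.ByteString
  } deriving (Show, Eq)

-- Same shape as parseFormat: the format argument selects the parser.
-- Adding a new parser means adding a constructor and one more equation here.
demoParseFormat :: DemoFormat -> BC.ByteString -> IO (Either String [DemoDocument])
demoParseFormat DemoLines bs =
  pure (Right [DemoDocument l BC.empty | l <- nonEmptyLines bs])
demoParseFormat DemoTabSeparated bs =
  pure (traverse tabDoc (nonEmptyLines bs))

-- One document per line, "title<TAB>body"; any other layout is an error.
tabDoc :: BC.ByteString -> Either String DemoDocument
tabDoc l = case BC.split '\t' l of
  [t, b] -> Right (DemoDocument t b)
  _      -> Left ("expected two tab-separated fields in: " ++ BC.unpack l)

nonEmptyLines :: BC.ByteString -> [BC.ByteString]
nonEmptyLines = filter (not . BC.null) . BC.lines

main :: IO ()
main = do
  result <- demoParseFormat DemoTabSeparated (BC.pack "Title A\tBody A\nTitle B\tBody B\n")
  print result
```

Because the dispatch is a plain pattern match on the format type, the compiler can warn about a constructor that has no parser yet when incomplete-pattern warnings are enabled.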
Synopsis
- data FileFormat
- clean :: ByteString -> ByteString
- parseFile :: FileFormat -> FilePath -> IO (Either String [HyperdataDocument])
- cleanText :: Text -> Text
- parseFormat :: FileFormat -> ByteString -> IO (Either String [HyperdataDocument])
Documentation
data FileFormat
Different parsers are available depending on the format of the input file.
Instances
Show FileFormat
Defined in Gargantext.Core.Text.Corpus.Parsers
Methods
- showsPrec :: Int -> FileFormat -> ShowS
- show :: FileFormat -> String
- showList :: [FileFormat] -> ShowS
clean :: ByteString -> ByteString
parseFile :: FileFormat -> FilePath -> IO (Either String [HyperdataDocument])
Parse a file into documents. TODO: manage errors here. TODO: to help debugging, consider adding the file path to the error message.
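One possible way to address the second TODO is sketched below, under stated assumptions. parseFileWithPath is a hypothetical helper, not part of this module: it takes any parser of the shape ByteString -> IO (Either String [doc]) (for example parseFormat applied to a format), reads the file, and prefixes both I/O errors and parser errors with the file path.

```haskell
import Control.Exception (IOException, try)
import Data.Bifunctor (first)
import qualified Data.ByteString as BS

-- Hypothetical helper: run a ByteString parser on a file and make sure
-- every error message names the file it came from.
parseFileWithPath
  :: (BS.ByteString -> IO (Either String [doc]))  -- e.g. a partially applied parser
  -> FilePath
  -> IO (Either String [doc])
parseFileWithPath parser path = do
  contents <- try (BS.readFile path) :: IO (Either IOException BS.ByteString)
  case contents of
    -- Reading failed (missing file, permissions, ...): report it as a Left.
    Left ioErr  -> pure (Left (path ++ ": " ++ show ioErr))
    -- Reading succeeded: run the parser and tag its error, if any, with the path.
    Right bytes -> first (\msg -> path ++ ": " ++ msg) <$> parser bytes

main :: IO ()
main = do
  -- A stub parser that always fails, used only to show the error shape.
  let alwaysFails _ = pure (Left "unsupported content") :: IO (Either String [()])
  result <- parseFileWithPath alwaysFails "does-not-exist.txt"
  print result
```

Keeping the path prefix in a wrapper leaves the individual parsers unaware of where their input came from, so they can still be reused on in-memory input.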
parseFormat :: FileFormat -> ByteString -> IO (Either String [HyperdataDocument])