gargantext - Search, map, share
Copyright: (c) CNRS 2017-Present
License: AGPL + CECILL v3
Safe Haskell: None



Token and occurrence

An occurrence is not necessarily a token. Consider the sentence "A rose is a rose is a rose". We may equally correctly state that there are eight or three words in the sentence. There are, in fact, three word types in the sentence: "rose", "is" and "a". There are eight word tokens in a token copy of the line. The line itself is a type. There are not eight word types in the line: it contains (as stated) only the three word types "a", "is" and "rose", each of which is unique. So what do we call what there are eight of? They are occurrences of words: three occurrences of the word type "a", two of "is" and three of "rose". Source :
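The distinction above can be checked with a few lines of Haskell. This is a minimal sketch, not part of this module: the names `occurrencesOfWords`, `sentence`, `tokenCount` and `typeCount` are hypothetical, and the sentence is lower-cased by hand so that "A" and "a" count as the same word type.

```haskell
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)

-- | Count how many times each word type occurs among the word tokens.
-- (Hypothetical helper, splitting on whitespace only.)
occurrencesOfWords :: String -> Map String Int
occurrencesOfWords = Map.fromListWith (+) . map (\w -> (w, 1)) . words

-- | The example sentence, lower-cased so "A" and "a" share one type.
sentence :: String
sentence = "a rose is a rose is a rose"

-- | Number of word tokens: every word in the line is counted.
tokenCount :: Int
tokenCount = length (words sentence)

-- | Number of word types: distinct words only.
typeCount :: Int
typeCount = Map.size (occurrencesOfWords sentence)
```

Here `tokenCount` is 8, `typeCount` is 3, and `occurrencesOfWords sentence` maps "a" to 3, "is" to 2 and "rose" to 3.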



type Occ a = Map a Int

type Cooc a = Map (a, a) Int

type FIS a = Map (Set a) Int
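To make the three synonyms concrete, the sketch below repeats them locally and builds one small value of each. The `Occ` counts come from the rose sentence; the `Cooc` and `FIS` counts are invented for illustration only. Reading `FIS` as "frequent item set" is an assumption, since the page does not expand the abbreviation.

```haskell
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)
import qualified Data.Set as Set
import           Data.Set (Set)

-- Local copies of the synonyms documented on this page.
type Occ  a = Map a Int
type Cooc a = Map (a, a) Int
type FIS  a = Map (Set a) Int

-- | One count per item.
exampleOcc :: Occ String
exampleOcc = Map.fromList [("a", 3), ("is", 2), ("rose", 3)]

-- | One count per pair of items (illustrative numbers).
exampleCooc :: Cooc String
exampleCooc = Map.fromList [(("a", "rose"), 3), (("is", "rose"), 2)]

-- | One count per set of items (illustrative numbers).
exampleFIS :: FIS String
exampleFIS = Map.fromList [(Set.fromList ["a", "is", "rose"], 1)]
```

Because the key of `FIS` is a `Set`, the order in which items are listed does not matter when looking a set up.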

data Group



type Occs = Int

type Coocs = Int

cooc :: [[Terms]] -> Map ([Text], [Text]) Int

coocOnWithLabel :: (Ord label, Ord b) => (a -> b) -> (b -> label) -> [[a]] -> Map (label, label) Coocs

coocOn :: Ord b => (a -> b) -> [[a]] -> Map (b, b) Int

coocOn' :: Ord b => (a -> b) -> [a] -> Map (b, b) Int
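The signatures of `coocOn` and `coocOn'` suggest a per-context count that is then summed over all contexts. The following is a sketch of one way to do that, not the module's actual implementation: `coocOnSketch` and `coocOnSketch'` are hypothetical names, and the choices made here (each unordered pair of distinct keys counted once per context, stored with the smaller key first, no self-pairs) are assumptions that the real functions may not share.

```haskell
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)
import qualified Data.Set as Set

-- | Hypothetical single-context counter: every unordered pair of distinct
-- keys found in the context gets the count 1, stored as (smaller, larger).
coocOnSketch' :: Ord b => (a -> b) -> [a] -> Map (b, b) Int
coocOnSketch' f xs =
  Map.fromList [ ((x, y), 1) | x <- keys, y <- keys, x < y ]
  where
    -- Deduplicate keys so a repeated item does not inflate the count.
    keys = Set.toAscList (Set.fromList (map f xs))

-- | Hypothetical multi-context counter: sum the per-context maps.
coocOnSketch :: Ord b => (a -> b) -> [[a]] -> Map (b, b) Int
coocOnSketch f = Map.unionsWith (+) . map (coocOnSketch' f)
```

With two contexts `["a", "b"]` and `["a", "b", "c"]` and `id` as the key function, the pair `("a", "b")` appears in both contexts and so receives the count 2, while `("a", "c")` and `("b", "c")` each receive 1.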

coocOnContexts :: (a -> [Text]) -> [[a]] -> Map ([Text], [Text]) Int

coocOnSingleContext :: (a -> [Text]) -> [a] -> [(([Text], [Text]), Int)]

occurrences :: [Terms] -> Map Grouped (Map Terms Int)

Compute the grouped occurrences (occ)

occurrencesOn :: (Ord a, Ord b) => (a -> b) -> [a] -> Map b (Map a Int)
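The result type `Map b (Map a Int)` reads as "for each group key, how often each original item occurred". A minimal sketch of a function with that shape follows. It is an assumption about the behaviour, not the module's code, and `occurrencesOnSketch` and `exampleGrouped` are hypothetical names.

```haskell
import           Data.Char (toLower)
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)

-- | Hypothetical grouped counter: items are grouped under @f item@,
-- and inside each group every distinct item is counted.
occurrencesOnSketch :: (Ord a, Ord b) => (a -> b) -> [a] -> Map b (Map a Int)
occurrencesOnSketch f =
  Map.fromListWith (Map.unionWith (+)) . map (\x -> (f x, Map.singleton x 1))

-- | Group surface forms under their lower-cased spelling.
exampleGrouped :: Map String (Map String Int)
exampleGrouped = occurrencesOnSketch (map toLower) ["Rose", "rose", "rose", "is"]
```

In `exampleGrouped`, the key "rose" holds the inner map with "Rose" counted once and "rose" counted twice, and the key "is" holds "is" counted once.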

occurrencesWith :: (Foldable list, Ord k, Num a) => (b -> k) -> list b -> Map k a
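A function with the signature of `occurrencesWith` can be written as a single strict fold that adds 1 under the key of each element. The sketch below shows that idea under the assumption that this is the intended behaviour. `occurrencesWithSketch` is a hypothetical name.

```haskell
import           Data.Foldable (foldl')
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)

-- | Hypothetical counter: for each element, add 1 under the key @f element@.
-- Works on any Foldable container and any numeric count type.
occurrencesWithSketch :: (Foldable list, Ord k, Num a) => (b -> k) -> list b -> Map k a
occurrencesWithSketch f = foldl' step Map.empty
  where
    step m x = Map.insertWith (+) (f x) 1 m
```

For instance, `occurrencesWithSketch length ["a", "rose", "is", "a"] :: Map Int Int` counts words by length: two words of length 1, one of length 2 and one of length 4.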

sumOcc :: Ord a => [Occ a] -> Occ a
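Judging by its name and type, `sumOcc` merges several occurrence maps by adding the counts of equal keys. A sketch under that assumption, with the hypothetical name `sumOccSketch` and a local copy of `Occ`:

```haskell
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)

-- Local copy of the synonym documented on this page.
type Occ a = Map a Int

-- | Hypothetical merge: counts for the same key are added together.
sumOccSketch :: Ord a => [Occ a] -> Occ a
sumOccSketch = Map.unionsWith (+)

-- | Two partial counts of the rose sentence, split arbitrarily.
partA, partB :: Occ String
partA = Map.fromList [("a", 1), ("rose", 2)]
partB = Map.fromList [("a", 2), ("is", 2), ("rose", 1)]
```

`sumOccSketch [partA, partB]` gives back the full counts: "a" 3, "is" 2 and "rose" 3.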