gargantext-0.0.7.1.5.3: Search, map, share
Copyright: (c) CNRS 2017-Present
License: AGPL + CECILL v3
Maintainer: team@gargantext.org
Stability: experimental
Portability: POSIX
Safe Haskell: Safe-Inferred
Language: Haskell2010

Gargantext.Core.Text.Metrics.Count

Description

Token and occurrence

An occurrence is not necessarily a token. Consider the sentence "A rose is a rose is a rose". We may equally correctly state that it contains eight words or three words. It contains three word types ("a", "is" and "rose"), each of which is unique, and a token copy of the line contains eight word tokens. The line itself is a type, so it cannot contain eight word types. What, then, are there eight of? Occurrences of words: three occurrences of the word type "a", two of "is" and three of "rose". Source: https://en.wikipedia.org/wiki/Type%E2%80%93token_distinction#Occurrences
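The distinction can be made concrete with a minimal sketch that counts tokens, types and occurrences for the example sentence. The helper names (wordOccurrences, tokenCount, typeCount) are hypothetical and not part of this module; the sketch assumes whitespace tokenisation and case-insensitive comparison, and uses String rather than Text to stay within base and containers.

```haskell
import Data.Char (toLower)
import qualified Data.Map.Strict as Map

-- | Occurrences per word type (assumption: lower-cased,
-- whitespace-separated words).
wordOccurrences :: String -> Map.Map String Int
wordOccurrences =
  Map.fromListWith (+) . map (\w -> (map toLower w, 1)) . words

-- | Number of word tokens: every occurrence is counted.
tokenCount :: String -> Int
tokenCount = length . words

-- | Number of word types: only distinct words are counted.
typeCount :: String -> Int
typeCount = Map.size . wordOccurrences
```

For "A rose is a rose is a rose", tokenCount gives 8, typeCount gives 3, and wordOccurrences maps "a" to 3, "is" to 2 and "rose" to 3.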

Documentation

type Occ a = Map a Int

type Cooc a = Map (a, a) Int

type FIS a = Map (Set a) Int

data Group

Constructors

ByStem
ByOntology

type Grouped = Stems

type Occs = Int

type Coocs = Int

type Threshold = Int

removeApax :: Threshold -> Map ([Text], [Text]) Int -> Map ([Text], [Text]) Int
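The signature of removeApax suggests a threshold filter over a count map; the name presumably refers to hapax legomena, terms that occur only once. The following is a sketch under that assumption, not the module's implementation: removeBelow is a hypothetical name, the key type is left polymorphic, and the strict comparison (> rather than >=) is a guess.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

type Threshold = Int

-- | Hypothetical stand-in for removeApax: keep only the entries
-- whose count is strictly greater than the threshold.
-- Assumption: the real function may use >= instead.
removeBelow :: Threshold -> Map k Int -> Map k Int
removeBelow t = Map.filter (> t)
```

With a threshold of 1, entries counted exactly once are dropped and everything else is kept.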

cooc :: [[Terms]] -> Map ([Text], [Text]) Int

coocOnWithLabel :: (Ord label, Ord b) => (a -> b) -> (b -> label) -> [[a]] -> Map (label, label) Coocs

coocOn :: Ord b => (a -> b) -> [[a]] -> Map (b, b) Int

coocOn' :: Ord b => (a -> b) -> [a] -> Map (b, b) Int

coocOnContexts :: (a -> [Text]) -> [[a]] -> Map ([Text], [Text]) Int

coocOnSingleContext :: (a -> [Text]) -> [a] -> [(([Text], [Text]), Int)]
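The cooc family above maps a list of contexts to a count per pair of items. One plausible reading is sketched below; it is not taken from the module's source. The names pairsInContext and coocSketch are hypothetical, and the sketch assumes that each context is deduplicated, that only pairs (x, y) with x < y are counted, and that a pair scores at most 1 per context. The library's conventions for ordering and self-pairs may differ.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- | Pairs (x, y) with x < y among the distinct projected items
-- of a single context, each with a count of 1.
pairsInContext :: Ord b => (a -> b) -> [a] -> [((b, b), Int)]
pairsInContext f ctx =
  [ ((x, y), 1) | x <- items, y <- items, x < y ]
  where
    -- Deduplicate and sort the projected items.
    items = Set.toAscList (Set.fromList (map f ctx))

-- | Sum the per-context pair counts over all contexts.
coocSketch :: Ord b => (a -> b) -> [[a]] -> Map (b, b) Int
coocSketch f = Map.fromListWith (+) . concatMap (pairsInContext f)
```

For the contexts [["a", "b"], ["b", "a", "c"]] with id as the projection, the pair ("a", "b") is counted twice and the pairs ("a", "c") and ("b", "c") once each.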

occurrences :: [Terms] -> Map Grouped (Map Terms Int)

Compute the grouped occurrences (occ)

occurrencesOn :: (Ord a, Ord b) => (a -> b) -> [a] -> Map b (Map a Int)
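The type of occurrencesOn describes a two-level count: items are grouped under a key, and within each group every distinct item gets its own count. A minimal sketch with the same type follows. occurrencesOnSketch and lowercaseKey are hypothetical names, and lower-casing is only a stand-in for the stem-based grouping implied by Grouped = Stems.

```haskell
import Data.Char (toLower)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- | Group items by a key function and count each distinct item
-- inside its group.
occurrencesOnSketch :: (Ord a, Ord b) => (a -> b) -> [a] -> Map b (Map a Int)
occurrencesOnSketch f =
  Map.fromListWith (Map.unionWith (+))
    . map (\x -> (f x, Map.singleton x 1))

-- | Illustrative grouping key (assumption: stands in for stemming).
lowercaseKey :: String -> String
lowercaseKey = map toLower
```

Applied to ["Rose", "rose", "rose", "Is"] with lowercaseKey, the group "rose" holds "Rose" with count 1 and "rose" with count 2, and the group "is" holds "Is" with count 1.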

occurrencesWith :: (Foldable list, Ord k, Num a, Show k, Show a, Show (list b)) => (b -> k) -> list b -> Map k a

sumOcc :: Ord a => [Occ a] -> Occ a
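Given Occ a = Map a Int, merging several occurrence maps amounts to a key-wise sum. The sketch below shows one way to do that with Map.unionsWith from containers. sumOccSketch is a hypothetical name, and whether sumOcc is implemented this way is an assumption.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

type Occ a = Map a Int

-- | Merge occurrence maps, adding the counts of keys that
-- appear in more than one map.
sumOccSketch :: Ord a => [Occ a] -> Occ a
sumOccSketch = Map.unionsWith (+)
```

Merging a map with "a" at 1 and "b" at 2 with a map holding "a" at 3 yields "a" at 4 and "b" at 2; an empty list yields the empty map.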